AGI persists not because it’s a coherent scientific objective, but because it functions as a lucrative mythology perfectly aligned with VC expectations...Next step...analyze AGI not just as bad science, but as good branding.
It seems like something of a scientific objective: to understand human thinking, try making a machine that can do it.
Are we sure that is what is happening? Can you really do any meaningful "science" when the subject under study is a black box kept under a shroud of secrecy? What has been learned from LLMs regarding human cognition, and is there broad convergence on that view?
It's not the main driver of what's happening, but it's an aspect of it that goes back a long way. For example, Turing writing in 1946:
>I am more interested in the possibility of producing models of the action of the brain than in the applications to practical computing...although the brain may in fact operate by changing its neuron circuits by the growth of axons and dendrites, we could nevertheless make a model... https://en.wikipedia.org/wiki/Unorganized_machine
Oh man I was not aware of this aspect of Turing's work thank you for sharing!!
Honestly, trying to reverse engineer something to understand how it works is interesting and potentially worthwhile! To me it's obvious that "broadly mechanistic" or causal explanations of specific cognitive functions can be created. I am not doubting that a "machine" can mimic human cognitive abilities, insofar as we can state them or "tokenize" them precisely. I am pretty sure that is the whole basis of Cognitive Science.
But just because we can mimic those capacities, does that imply that those are the same mechanisms that exist in nature? Herbert Simon made a distinction between "natural" and "artificial" systems: an LLM's function is to model language (and they do a damn good job of that!); does the brain have one function, and what is it? If you build a submarine, does that tell you something about how fish swim? Even if it swims faster than any of the fish?
Building models can help you understand things. Maybe not so much submarines but building model aircraft and studying aerodynamics definitely helps understand how birds fly.
Artificial neural networks are already helping us understand brains a bit. For example, there was a lot of debate about "universal grammar":
>humans possess an innate, biological predisposition for language acquisition, including a "Language Acquisition Device"...
and it now seems demonstrated that LLM-like neural networks are quite good at picking up language without an 'acquisition device' beyond the general network.
That is a fair point. I do not disagree that building (tenuous at best) models of neurons can help inform science and engineering, and vice versa. Much of "classic" digital signal processing and image processing was an interplay between psychologists, engineers, neuroscientists, etc. So that is very useful! But what we have here is mistaking the airplane for the bird! My pet parrot doesn't have an engine! The map is not the territory, as it is said.
The point of this thread and the paper isn't that cognition is not an important thing to understand, nor that it isn't computational (computation seems to be the best model we currently have).

But AGI is (as the previous comment mentioned) a marketing term of little scientific value. It is too vague and carries more of the baggage of religious belief than of cold, hard scientific inquiry. It used to just be called "AI", or, as was debated in the infancy of the field, just "complex information processing". The current for-profit companies (let's be clear, OpenAI is not really a charity) don't really care about understanding anything ... to an outsider they appear to maximize hype to drum up investment so that they can build a God, while some people get very, very rich. To many in these communities, intelligence is some magical quantity that can "solve everything!" I am not sure which part of those beliefs is scientific. Why are we earmarking hundreds of billions of dollars (some of it public money) to benefit these companies?
>humans possess an innate, biological predisposition for language acquisition, including a "Language Acquisition Device"...
Would you say that one day someone just happened to find an LLM chilling under the sun, we spoke some words to it for a few years by pointing to things, and one day it was speaking full sentences and asking about the world? Or is it that a lot of engineering work was put into specifically designing something for the purpose of generating text ... Do you think humans were designed to speak or to be intelligent, and by whom? Can dolphins, gorillas, and elephants also speak language? They have complex brains with a lot of neurons. Chomsky’s point was roughly “if human, then can acquire language”, so “something non-human can acquire language” doesn’t refute the central claim. I am no expert on Chomsky; you may know much more about that. But again, it doesn’t seem relevant to the actual thread.
So TLDR: I am not sure we learned a lot about how humans learn language from LLMs: all we learned is that it can be done by "something", but we already knew that. These specific technologies are products designed to sell things, and they need the hype for that. But it doesn't take away from the fact that they are freaking cool!
https://leon.bottou.org/news/two_lessons_from_iclr_2025
I'm not sure we haven't learnt some things about cognition, or about (some) cognition-having entities in general. Whether or not LLMs' inner workings overlap with how humans do it, we now know more about the subject itself.
>Can you really do any meaningful "science" when the subject understudy is a black box that is under a shroud of secrecy?
Are you saying it's impossible to understand human brains?
No. I am saying that the broader scientific community probably cannot run experiments on ChatGPT, Claude, or Gemini the way they can on, say, a mouse's brain, or even on human subjects, with carefully controlled experiments that can be replicated by third parties.
As for "understanding", you have to be more precise about what you mean: we created LLMs and Transformer-based ANNs (and ANNs themselves), and it appears we are all mystified by what they can do ... as though they are magic ... and will lead to Super-intelligence (an even more poorly defined term than regular-ass intelligence).
I'm not trying to be difficult, but I sometimes wonder what would happen if all of us took a step back and really tried to understand this tech before jumping to conclusions! "The thing that was designed to be a universal function approximator approximates the function we trained it to approximate! HOLY CRAP WE MAY HAVE MADE GOD!" It's clear that the technologies we currently have are miraculous and do amazing things! But are they really doing exactly what humans do? Is it possible to converge at similar destinations without taking the same route? Are we even at the exact same destination?
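To make the function-approximator quip concrete, here is a toy sketch (purely my own illustration in plain NumPy, not anything from the paper or the thread): a tiny network trained on samples of sin(x) ends up approximating sin(x), the function we trained it to approximate, and nothing about brains follows from that.

    # Illustrative only: a small MLP fits the function it is trained on.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-np.pi, np.pi, (256, 1))   # training inputs
    y = np.sin(X)                              # training targets

    # One hidden layer of 32 tanh units, trained with full-batch gradient descent.
    W1, b1 = rng.normal(0, 0.5, (1, 32)), np.zeros(32)
    W2, b2 = rng.normal(0, 0.5, (32, 1)), np.zeros(1)

    lr = 0.1
    for _ in range(20000):
        h = np.tanh(X @ W1 + b1)       # hidden activations
        pred = h @ W2 + b2             # network output
        err = pred - y                 # residual against the training targets
        # Gradients of the (halved) mean squared error, backpropagated by hand.
        gW2, gb2 = h.T @ err / len(X), err.mean(0)
        dh = (err @ W2.T) * (1 - h ** 2)
        gW1, gb1 = X.T @ dh / len(X), dh.mean(0)
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1

    # The training error should end up small: the net fits sin(x) on this range,
    # which by itself says nothing about how brains do anything.
    print("train MSE:", float((err ** 2).mean()))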
People are trying to run experiments on Claude - see https://news.ycombinator.com/item?id=43495617
Yes, I know of this "study". AFAIK it has not been subjected to peer review and it uses a lot of suggestive language. Other studies have shown that these things use large bags of heuristics, which isn't surprising given that they are trained on unimaginably large amounts of tokens.
I am not an expert ... but to me anything that is associated with these companies is marketing. I understand that makes me a "stick in the mud" but it's not a crime to be skeptical! THAT SHOULD BE THE DEFAULT ... we used to believe in gods, demons, and monsters. Given that Anthropic is very very closely related to EA and Longtermism and given that this is the "slickest" paper I have ever read ...
If I had the mental capacity to have read a good amount of the internet and millions of pirated books ... I wouldn't be confused by perturbations in questions I have already previously seen.
I am sure there are lots of cogent rebuttals to what I am saying, and hey, maybe I'm just a sack of meat that is miffed about being replaced by a "superior intelligence" that is "more evolved". But that isn't how evolution works either, and it's troubling to see that sentiment becoming so prevalent.
I highly doubt any company is just focusing on AGI, including OpenAI. Else they wouldn't keep releasing 5 versions of 4o with different "personalities".
Humans are a general intelligence, yet the vast majority of us have our own personality. Unless you're thinking of a superintelligence that can simulate any personality it wants.
Tweaking the "personality" of LLMs has nothing to do with how smart they are. And using lists or emojis more doesn't make them more intelligent. It just increases usage because people like talking to them more.
Google currently has job ads out for post-AGI AI research.
See my post again. I said "just focusing".
No.
Yes.
Introducing the two-bit weight! Now you can pack all your uniform greyzones into the variable name. Save memory, process your data faster on smaller chips! We can retrain them; we have the technology!
Is it just me, or is this title gross and annoying to the point that it's straight-up trolling?
It is kinda. And reading the abstract it's maybe worse.
Yeah. I also don’t understand why it’s an arXiv article rather than a blog post.
Because papers are increasingly written to catch the attention of news publications/blogs/social media instead of professors/academics/researchers.
Position papers are not a recent phenomenon.
The talent pool has thinned due to oversaturation, isn't that obvious?
We can't.
I'll explain why very simply. The vision of AI and the vision of Virtual Reality both existed well before the technology. We envisioned humanoid robots well before we ever had a chance of making them. We also envisioned an all-knowing AI well before we had our current technology. We will continue to envision the end state because it is the most natural conclusion. No human can help imagining the inevitable. Every human, technical or not, has the capacity to fully imagine this future, which means the entirety of the human race will be directed toward this foregone conclusion.
Like God and Death (and taxes). shrugs
Smith: It is inevitable, Mr. Anderson.
Some marketer decided to call this stuff AI precisely because they wanted to make the connection to those grand visions that you're talking about.
If instead we called them what they are, Large Language Models, would you still say that they were hurtling inevitably towards Generalized Intelligence?
Yeah.
Why? How do LLMs and diffusion models relate to the "vision of AI and the vision of Virtual Reality"?
We don't have to build the torment nexus.