The modest power, relative to the size of the economy, of GPT LLM MAMLMs as linguistic artifacts. Of course, since the economy is super huge, a relatively modest effect on it is still a huge one. But...
What Brad/Cosma's explanation of LLMs misses (along with most others) -- the reason why Brad's intuition "strongly leads [him] to think that they should not be able to do half as well as they do" -- is the concept of **faithful representation**. Put simply: Reality allows itself to be mapped into a low-dimensional latent space, and some compression schemes are just **good mappings**. [Or at least, on the instrumentalist view, what gets mapped is humanity's currently-known shared ways of describing and predicting our observations of reality.]
This is why our scientific and statistical models work in the first place. Brute-force ML works because (or to the extent that) it finds these mappings. LLMs work because their training process finds the same latent representations as (or equivalent to) the representations that internet sh*tposters have in their heads as they're writing [1]. Therefore, they can quite accurately predict what a sh*tposter would have said, in pretty much any context.
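To make the "good mappings" point concrete, here is a minimal toy sketch (my own illustration, not taken from [1]): when observations really are generated from a low-dimensional latent space, even the crudest compression scheme, truncated SVD/PCA, recovers a representation that reconstructs the data almost perfectly.

```python
# Toy illustration of "faithful representation": high-dimensional observations
# generated from a low-dimensional latent space can be recovered by a purely
# brute-force fit (here, truncated SVD / PCA). Dimensions and noise level are
# arbitrary choices for this sketch, not anything from the cited papers.
import numpy as np

rng = np.random.default_rng(0)

d_latent, d_obs, n = 3, 50, 2000
Z = rng.normal(size=(n, d_latent))               # true latent states
W = rng.normal(size=(d_latent, d_obs))           # unknown "decoder" into observation space
X = Z @ W + 0.05 * rng.normal(size=(n, d_obs))   # noisy high-dimensional observations

# Brute-force compression: keep only the top-3 principal directions of X.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_hat = (U[:, :d_latent] * S[:d_latent]) @ Vt[:d_latent]

# If the mapping is "faithful", 3 directions explain almost all the variance.
explained = (S[:d_latent] ** 2).sum() / (S ** 2).sum()
recon_err = np.linalg.norm(Xc - X_hat) / np.linalg.norm(Xc)
print(f"variance explained by 3 components: {explained:.3f}")  # close to 1.0
print(f"relative reconstruction error:      {recon_err:.3f}")  # close to 0.0
```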
Yes, this is just Plato's concept of "carving nature at its joints" from ~2400 years ago. Quite surprisingly, it turns out that we're all living at the unique moment in humanity's history where we are discovering that such a carving is an objective possibility, and that we can, for the first time, automate this carving [2, 3]. And this does not just apply to language, BTW: if the success of multimodal models weren't enough, tabular foundation models [4] demonstrate that good old statistics has the same property.
There is still the crucial question of whether/when, beyond "just" the latent representations, ML can learn the true causal world model: which latent states at time t cause which latent states at t' > t. This is, in a technical sense, much harder than learning temporal correlations, and in at least some cases it is **impossible** to learn from observational data alone, requiring the capability to intervene/experiment [5]. However, it has recently been proven that learning a good causal model is a necessary condition for robust behavior -- i.e., reliable extrapolation beyond the training data [6]. (A toy illustration of that impossibility follows below.)
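Here is a minimal illustration of the impossibility point (my own toy example under standard linear-Gaussian assumptions, not taken from [5]): two models with opposite causal arrows are constructed to have exactly the same joint distribution, so no amount of observational data can tell them apart; a single intervention can.

```python
# Two opposite causal models, X -> Y and Y -> X, parameterized to induce the
# *same* joint distribution. Observational data cannot distinguish them; an
# intervention do(X = 0) can. Parameters are arbitrary choices for the sketch.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Model A:  X -> Y
xA = rng.normal(0, 1, n)
yA = xA + rng.normal(0, 1, n)             # Y = X + noise

# Model B:  Y -> X, tuned to match Model A's joint distribution
yB = rng.normal(0, np.sqrt(2), n)
xB = 0.5 * yB + rng.normal(0, np.sqrt(0.5), n)

# Observationally indistinguishable: (near-)identical covariance matrices.
print(np.cov(xA, yA))   # approx [[1, 1], [1, 2]]
print(np.cov(xB, yB))   # approx [[1, 1], [1, 2]]

# Intervention do(X = 0): clamp X, then see what happens to Y.
yA_do = 0.0 + rng.normal(0, 1, n)         # in A, Y still listens to X: Var approx 1
yB_do = rng.normal(0, np.sqrt(2), n)      # in B, Y ignores X:          Var approx 2
print(np.var(yA_do), np.var(yB_do))       # only the intervention separates the models
```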
We've been lucky so far that the Internet already has a lot of natural-language descriptions of causal models of just about everything. These can be compressed into "meta-representations" that let LLMs "interpolate extrapolations". This is similar at some level to how humans learn much of their own extrapolation capability -- not by experimenting themselves, but by learning theory from other people and representing it in their heads as "little stories" that they can interpolate.
However, because these meta-representations can't be directly grounded in observational data, the way LLMs learn them is very sensitive to the quality of the training corpus. That is why stuff like embodied learning and synthetic data are hot topics in AI: This is stuff you want to get *exactly right*.
Regardless, "just" getting the representation right is a huge step in the right direction, and it goes a long way toward explaining/justifying the conceptual leap from "just statistics on steroids" to "AGI" made by LLM stans.
[1] https://arxiv.org/abs/2405.07987
[2] https://arxiv.org/abs/2505.12540v2
[3] https://aiprospects.substack.com/p/llms-and-beyond-all-roads-lead-to
[4] https://www.nature.com/articles/s41586-024-08328-6
[5] https://smithamilli.com/blog/causal-ladder/
[6] https://arxiv.org/abs/2402.10877
A couple of weeks ago I opened up a stub of a paper I'm working on, a PDF of not quite 8 pages. Acrobat informed me: "This looks like a long article. Would you like me to summarize it?"
I was appalled that it treated an 8-page piece of writing as "long," but curious what it would come up with for a summary.
The summary was mostly good, but it got one key idea wrong.
If I didn't know the writing (having, after all, written it), I wouldn't have known how much of the summary was on-target and how much was erroneous. Simply knowing that some of the info is incorrect doesn't help - to figure out which is the incorrect stuff, you have to go in and read the article. And then the point of the summary was...?
I suppose one could say that the standard shouldn't be perfection, but rather whether the AI does better than a graduate research assistant would. My college is undergrads only, so I don't have any experience evaluating the work of a graduate research assistant.
I see there are a number of side remarks at the end of Shalizi's article under the heading "O You Who Believe in the Resurrection and the Last Day," notably a link to his recommended science fiction at http://bactra.org/notebooks/sf-recs.html, which I don't recall encountering previously.
For better or worse, that section (and its links) has been there since the first iteration of the notebook in 2023.
Both this and the previous post were very good summations of MAMLMs and their likely impact. The AGI/SI hype creates lots of interest/angst, but is really about raising capital for the hyperscalers. The rest of us are trying to find personally productive uses for the technology. Finding documents and summarizing them in the aggregate is useful, as long as you can at least spot-check the output. Writing boilerplate code is a time-saver, and we may even write simple applications with vibe-coding. Whether it will be a boon for more than a fraction of the population, IDK. I once thought that spreadsheets had become a commonplace tool, even if only to generate lists, check receipts, etc. - until I discovered that the current generation of undergraduates didn't know how to use a spreadsheet at all.
What I really wish MAMLMs could do for me is develop equations from verbal descriptions, solve partial differential equations, and handle other math that is hard for me. But since I have no way to check the output, I cannot rely on it, and Mathematica is far too expensive to justify purchasing.
I applaud the idea that these tools should increase personal welfare, rather than being measured by improvements to GDP. However, like anti-virus software, LLMs may help defend against malware and scam artists; but is buying them really just the "broken windows" fallacy? I expect that, as Kevin Kelly suggests, their net benefit will be low but at least positive ("What Technology Wants," 2010). AIs will further complicate life, which is not something I want as I get older. I doubt I will see affordable Asimovian robots in my lifetime, but they would be very useful if they were available and did not require close hand-holding/micromanaging to do a range of household tasks.