Understanding Book-Classification ISBN Numbers & the Current Limits of MAMLMs
My view: MAMLMs (Modern Advanced Machine-Learning Models) have significantly less in the way of parsimonious, truth-based internal representations of the world than their boosters claim...
Ezra Klein writes:
Ezra Klein: ‘“I have a child who is also 2e and has been part of the NYC G&T program”, the nonsentient chatbot wrote to a group of human parents. “We’ve had a positive experience with the citywide program, specifically with the program at The Anderson School”. <404media.co/faceb…> I keep thinking about this. Meta's AI doesn't know it's lying. It is not trying to deceive. Its relationship with truth has nothing to do with ours. I think talking about "artificial general intelligence" misleads us. The capacities here might be general—they may become superhuman!—but this isn't intelligence rooted in the world, intelligence that knows when it is speaking honestly and when it is confabulating. It is deeply alien…. I don't think we are nearly ready for a world thickly populated with completely convincing bullshit… <https://www.threads.net/@ezraklein/post/C6B12qtuvo_>
I think Ezra is approaching this wrong: Facebook’s ChatBot is claiming to have a child in the NYC G&T program because that is what it has been trained to say by the RLHF that has tuned it. It could have been subjected to a different kind of RLHF, which would then have produced a result like:
The process of scanning the internet and my subsequent training leads me to calculate that a likely response by a human, if this question were asked of a human on a discussion forum on the internet, might well take the form of: “I have a child who is also 2e and has been part of the NYC G&T program. We’ve had a positive experience with the citywide program, specifically with the program at The Anderson School”…
Simply prefix every ChatGPT4 and related GPT MAMLM ChatBot answer with “The process of scanning the internet and my subsequent training leads me to calculate that a likely response by a human, if this question were asked of a human on a discussion forum on the internet, might well take the form of…” Do that, and most of Ezra Klein’s talk about models that “may become superhuman!—but… not intelligence rooted in the world… that knows… when it is confabulating…” vanishes away.
Think of it. The real difficulties with earlier models were (a) parsing natural-language inputs and (b) providing full-scope answers. If answers can be channeled into relatively few categories, and if input can be restricted to a small vocabulary of action words combined with modifier terms, then previous generations of models could have seemed to show equally impressive “sparks of AGI”. Google could have put a faux-AGI wrapper around Google Search, for example. Start each answer with: “The process of scanning the internet and my subsequent training leads me to calculate that the ten web pages that might perhaps be of the most use to you are…”
And consider what resources ChatGPT4 is currently throwing at the problem. I have been told by people who ought to know that ChatGPT4 is burning $0.05 of electricity per query-answer. Electricity costs about $0.05 per kWh at the industrial rates a data center might pay. That is a full kWh per answer, delivered in perhaps a bit over a minute: figure roughly 50 kW of draw, which might make sense if the slice of the system generating the inference response to a single query at any moment draws enough electricity to power 50 NVIDIA A100s, cooling and overhead included.
The human brain runs on 50 W, one-thousandth as much, unless my guesses are way, way off. On any using-language-to-understand-the-world metric, those 50 W massively outperform what can be done by a thousand times as much power running through crystallized, doped sand trying to think.
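A back-of-envelope sketch of that arithmetic, in which every number is a rough guess from the text rather than a measurement (the per-query cost, the electricity price, the response time, and the assumption of roughly 1 kW per A100 once cooling and overhead are folded in):

```python
# Back-of-envelope check of the power arithmetic above.
# All inputs are rough guesses, not measured values.

cost_per_query_usd = 0.05      # guessed electricity cost per query-answer
price_per_kwh_usd = 0.05       # guessed industrial electricity price
seconds_per_answer = 72        # "a bit over a minute" per answer

energy_per_query_kwh = cost_per_query_usd / price_per_kwh_usd   # ~1 kWh
draw_kw = energy_per_query_kwh / (seconds_per_answer / 3600)    # ~50 kW

a100_kw = 1.0   # assumed ~1 kW per NVIDIA A100 with cooling and host overhead
brain_w = 50    # rough power budget of a human brain, in watts

print(f"energy per answer: {energy_per_query_kwh:.1f} kWh")
print(f"implied draw:      {draw_kw:.0f} kW, or about {draw_kw / a100_kw:.0f} A100s")
print(f"ratio to a brain:  {draw_kw * 1000 / brain_w:.0f}x")
```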
Not that the accomplishments of ChatGPT4 are not amazing. But they are, to put it politely, very uneven for something that presents itself as being as fluent in English understanding and comprehension, and as confident and authoritative in its answers, as ChatGPT4 does.
Consider a question I was asking ChatGPT4 yesterday:
Q: What other novels are in the same sequence as Sarah Monette’s Melusine?
ChatGPT4: Sarah Monette’s novel Melusine is part of a four-novel sequence called “The Doctrine of Labyrinths.” The other novels in this sequence are:
The Virtu (2006)
The Mirador (2007)
Corambis (2009)
These novels continue the story of Felix Harrowgate and Mildmay the Fox, exploring their complex lives and the magical, intricate world Monette has created.
Pretty good, no?
Now the sentence “What other novels are in the same sequence as Sarah Monette’s Melusine?” did not (before this) exist on the internet. Or so says Google Search: it returns zero hits for this exact sentence. Google Search does, however, return eight hits it deems relevant, to pages with the titles:
“Sarah Monette Books In Order” (2),
“Sarah Monette” (2),
“Melusine” (2),
“Melusine Series in Order”, and
“‘Doctrine of Labyrinths’ series”.
I suspect that ChatGPT4 nails this because one of the pages the Google search returns is the Sarah Monette Wikipedia page, which contains the passage:
Doctrine of Labyrinths series:
——. (2005). Mélusine (1st ed.). New York: Ace Books. pp. 1–432. ISBN 978–0441012862.
——. (2006). The Virtu (hardcover ed.). Ace Books. pp. 1–439. ISBN 978–0441014040.
——. (2007). The Mirador (hardcover ed.). Ace Books. pp. 1–426. ISBN 978–0441015009.
——. (2009). Corambis (hardcover ed.). Ace Books. pp. 1–421. ISBN 978–0441015962.
And ChatGPT4 gloms onto this page.
But how much of the Wikipedia page was ChatGPT4 actually able to internalize in its training? In at least one sense, disturbingly little for something that presents itself as so fluent in English and so knowledgeable.
Let’s continue the conversation:
Q: What is the ISBN number of “The Virtu”?
ChatGPT4: The ISBN number of “The Virtu” by Sarah Monette is 978–0441014190.
This is not so good. The true ISBN for “The Virtu” is, as the Wikipedia page says, 978–0441014040.
In fact, there is no such ISBN: the identifier ChatGPT4 delivers fails the last-digit checksum. Change that last digit so the checksum passes (replace the final 0 with the checksum-consistent 4) and we discover that ISBN 978–0441014194 is the ISBN for The Tainted by Glenda Larke.
If it were to get the ISBN, checksum and all, right, ChatGPT4’s training would somewhere within itself have to represent:
‘Virtu’ + ‘ISBN’ → ‘978’ + ‘-’ + ‘0’ + ‘441’ + ‘01404’ + ‘0’
This would be the way for ChatGPT4 to “understand” that the ISBN number for the (hardcover) version of “The Virtu” is composed of the following elements (a code sketch checking this decomposition follows the list):
978 (it’s a book!)
- (separator)
0 (from an English-language publisher)
441 (to wit, Ace Books)
01404 (and this is the separate identifier for the hardcover edition of “The Virtu”)
0 (finally: checksum)
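Here is the promised sketch of that decomposition, assuming the registered hyphenation 978-0-441-01404-0 (the element boundaries are registry data, not something recoverable from the bare digit string), together with the standard ISBN-13 check-digit rule confirming the final 0:

```python
# Sketch of the ISBN-13 structure claimed above for the hardcover "The Virtu",
# assuming the hyphenation 978-0-441-01404-0.

def isbn13_check_digit(first_twelve: str) -> int:
    """ISBN-13 check digit: weights alternate 1, 3 across the 12 prefix digits."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first_twelve))
    return (10 - total % 10) % 10

elements = {
    "prefix (it's a book)": "978",
    "registration group (English-language publisher)": "0",
    "registrant (Ace Books)": "441",
    "publication (hardcover 'The Virtu')": "01404",
}
first_twelve = "".join(elements.values())
print(first_twelve, isbn13_check_digit(first_twelve))  # 978044101404 0 -> 978-0441014040
```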
Let’s see to what extent ChatGPT4 knows these things:
Q: What is the numeric prefix ISBN element for a book? Be concise. A: The numeric prefix ISBN element for a book, known as the GS1 Prefix, is either “978” or “979”
Q: What is the numeric code ISBN element for a publication in an English-speaking country? Be concise. A: For English-speaking regions, the group identifier is usually “0” or “1”.
Q: What is the numeric code ISBN element for a publication by Ace Books? Be concise. A: The numeric ISBN publisher code for Ace Books, which is a part of the Penguin Group, is typically “441” or “0441”.
Q: What is the numeric code individual book ISBN–13 publication element for the hardcover edition of the book “The Virtu” by Sarah Monette? Be concise. A: The ISBN–13 publication element for the hardcover edition of “The Virtu” by Sarah Monette is 978–0–441–01397–9.
Three right answers, and then a wrong one. We expect an answer of 01404. We get 9780441013979, and 9780441013979 ≠ 01404. And, once again, 9780441013979 is not a valid ISBN: the correct checksum digit is not “9” but rather “5”. ISBN–13 9780441013975 is the ISBN for “Balance of Trade: A Liaden Universe Novel”, by Steve Miller and Sharon Lee.
So while it “knows” the book prefix, the English-language group identifier, and the Ace Books registrant ID, it does not know that “01404” belongs to the (hardcover edition of) “The Virtu”. And unless it is explicitly asked to calculate the checksum, it does not know that the last digit of an ISBN is a checksum.
It is at this point that the MAMLMs-are-the-road-to-superhuman-AGI boosters (those who hold that Modern Advanced Machine-Learning Models are the road to superhuman Artificial General Intelligence) say: “scaling laws!”
ChatGPT4 has 1500 dimensions along which to map human language, with its 175 billion parameters. All we need, the boosters say, is another 10x increase in each dimension of model size. That would drive the electricity cost of answering questions up by a factor of 10 x 10 = 100: $5 per query, or roughly $5 a minute, or $300 an hour, which is of the same order of magnitude as what I charge. I would say: we are going to need more than just “scaling” to get to anything truly wonderful. (More truly wonderful than ChatGPT4 already is, that is: ChatGPT4 is wonderful!) But people like Dario Amodei say:
Dario Amodei: Interviewed by Ezra Klein: ‘OpenAI… 2018… GPT-1… used something like 100,000 times less computational power than the models we build today…. We found these very smooth [scaling-law] patterns… projection[s] that… [with] $10 billion [spent] on these models, instead of the $10,000 we were spending then… all of these wondrous things would happen…. The underlying technology… a smooth exponential… can be eerily preternaturally predictable…. I would say… three to 18 months… [is] going to change how people think about A.I…. [Already] the models, despite the fact that they’re wrong about a bunch of things… often perform better than the average of the data they see…. In the case of arithmetic… some underlying algorithm [is] used to add the numbers. And it’s simpler for the [neural network] models to hit on that algorithm than it is for them to do this complicated thing…. This connects to things like Occam’s razor and simplicity and parsimony in science. There’s some relatively simple web of truth out there in the world…. All the true things are connected… <https://www.nytimes.com/2024/04/12/podcasts/transcript-ezra-klein-interviews-dario-amodei.html>
Amodei is thus inverting Plato’s Parable of the Cave: LLMs, he says, already see the shadows cast on the walls of the patterns-of-language cave, and are already using them to build, inside their neural networks, representations of the actual three-dimensional objects casting the shadows. The standard example is that they “know” things like: king + (woman − man) → queen.
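The canonical illustration of that claim comes from word embeddings rather than from anything inspectable inside ChatGPT4 itself. A sketch using off-the-shelf word2vec vectors via gensim (the pretrained model is an assumption of convenience here, and a large download):

```python
# Classic embedding-arithmetic demo: king - man + woman ~ queen.
# Uses pretrained word2vec vectors, not anything extracted from ChatGPT4.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # large download on first use

result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # expected to put "queen" at or near the top
```

On the standard Google News vectors this query does return “queen” at the top; the question the ISBN example raises is whether anything comparably structured has been learned about identifiers like ISBNs.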
But we do know that all scaling laws and processes have limits.
And we do know that 1500 dimensions and 175 billion parameters of neural-network classification of phrases, and of construction of a flexible metric of nearness-in-meaning, are not enough for ChatGPT4 to “learn” enough to “understand” ISBNs.
Appendix:
And then there is the question about ISBNs to which ChatGPT4 gives its longest answer:
Q: What is the checksum digit that needs to be added to “978–044101404” to make it a valid ISBN–13 identifier? Be concise.
A: To calculate the checksum digit for an ISBN–13, we use the following formula:
Take each digit of the ISBN, from left to right.
Multiply each odd-positioned digit (from the left) by 1 and each even-positioned digit by 3.
Sum all the products.
Find the remainder when this sum is divided by 10.
Subtract this remainder from 10 to get the checksum digit. If the remainder is 0, then the checksum digit is 0.
Let’s apply this method to “978–044101404”:
Ignore the hyphens for calculation: 978044101404
Add an unknown checksum digit x: 978044101404x
Now, calculate:
Sum = (9×1) + (7×3) + (8×1) + (0×3) + (4×1) + (4×3) + (1×1) + (0×3) + (1×1) + (4×3) + (0×1) + (x×3)
Sum = 9 + 21 + 8 + 0 + 4 + 12 + 1 + 0 + 1 + 12 + 0 + 3x
Sum = 68 + 3x
To find the checksum x, solve:
Final Sum = 68 + 3x ≡ 0 (mod 10)
Let’s calculate the value of x that satisfies this equation.
The checksum digit that needs to be added to “978–044101404” to make it a valid ISBN–13 identifier is 0.
Thus, the complete ISBN–13 is “978–0441014040”.
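Redoing that calculation over all twelve prefix digits, as a check on the appendix arithmetic (a sketch only; the weighted sum comes to 80, so the check digit is indeed 0, and the same rule flags both of the invalid ISBNs ChatGPT4 offered earlier):

```python
# ISBN-13 rule: weight the 12 prefix digits 1, 3, 1, 3, ...; sum them;
# the check digit is (10 - sum mod 10) mod 10.
for prefix in ("978044101404", "978044101397", "978044101419"):
    total = sum(int(d) * w for d, w in zip(prefix, [1, 3] * 6))
    print(prefix, total, (10 - total % 10) % 10)

# 978044101404 80 0  -> 978-0441014040, "The Virtu"
# 978044101397 95 5  -> the 9 ChatGPT4 supplied fails the checksum
# 978044101419 96 4  -> the 0 ChatGPT4 supplied fails the checksum
```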
References:
Klein, Ezra. 2024. “‘I have a child…”. April 21. <https://www.threads.net/@ezraklein/post/C6B12qtuvo_>
Klein, Ezra, & Dario Amodei. 2024. “Transcript: Ezra Klein Interviews Dario Amodei”. The Ezra Klein Show. April 12. <https://www.nytimes.com/2024/04/12/podcasts/transcript-ezra-klein-interviews-dario-amodei.html>.
Monette, Sarah. 2005. Mélusine. New York: Ace Books. <https://www.worldcat.org/title/melusine/oclc/60513151>.
Monette, Sarah. 2006. The Virtu. New York: Ace Books. <https://en.wikipedia.org/wiki/The_Virtu>.
Monette, Sarah. 2007. The Mirador. New York: Ace Books. <https://en.wikipedia.org/wiki/The_Mirador>.
Monette, Sarah. 2009. Corambis. New York: Ace Books. <https://www.worldcat.org/title/corambis/oclc/233548300>.
Wikipedia. 2024. “Sarah Monette”. Accessed April 21. <https://en.wikipedia.org/wiki/Sarah_Monette>.