Are at Least Some of Us About to Replace "Programming" wiþ "Invocation"?
The SubTuringBradBot project; should I be thinking about what the right LLM "Grimoire" would look like?...
Part of: SubTuringBradBot…
The elective affinity in thought between programming and post-1000 western European conceptualizations of magic is old and strong.
For example:
Paul Dourish: THE ORIGINAL HACKER'S DICTIONARY: ‘WIZARD n. 1. A person who knows how a complex piece of software or hardware works; someone who can find and fix his bugs in an emergency. Rarely used at MIT, where HACKER is the preferred term. 2. A person who is permitted to do things forbidden to ordinary people, e.g., a "net wizard" on a TENEX may run programs which speak low-level host-imp protocol; an ADVENT wizard at SAIL may play Adventure during the day...
And:
Harold Abelson & Gerald Jay Sussman with Julie Sussman (1993): Structure and Interpretation of Computer Programs: "Computational processes are abstract beings that inhabit computers… manipulate other abstract things… data… [their] evolution… directed by a pattern of rules called a program. Programs… conjure processes… like a sorcerer's spells… [are] carefully composed… in arcane and esoteric programming languages that prescribe the tasks…. Our processes… execute… precisely…. Thus, like the sorcerer's apprentice, novice programmers must learn to understand and to anticipate the consequences of their conjuring…
This metaphor and analogy have been quite popular.
On the one hand, it is clearly a joke. On the other hand, it is an example of: “ha, ha: only serious”.
Now it looks like this metaphor and analogy may be taking another upward leap:
Ethan Mollick: Now is the time for grimoires: ‘It isn't data that will unlock AI, it is human expertise…. With… [Large Language Models]… I… think… the most useful thing… in this AI-haunted moment [is] creating grimoires, spellbooks full of prompts that encode… our hard-earned expertise in ways that AI can help other people apply…. Expertise…. Time with AI models…. [And] a vision of what you want the prompt to do that is focused and achievable…. What I would really like to see is large-scale public libraries of prompts, written by known experts and tested carefully for different audiences… freely available…. And robust discussions around these prompts…
The question is: Is this metaphor/analogy more illuminating or confusing?
And now it is time for me to get back to the SubTuringBradBot project—what I think is my best idea to try to make modern Transformer LLMs and other LMLMs useful to me in my life. The image stuff—I have a minor use for that. The slide-making stuff and such—that may be useful. But the problem is that I regard myself as a better writer than Chat-GPT and a much better internet searcher and results presenter than BingAI-Chat. As far as the main competences of Transformer models so far are concerned, things go more smoothly when they are not in my loop.
But can they be useful to me in cases that allow me to withdraw myself from the loop completely? I would like to find out. Hence my SubTuringBradBot project, which I have been sporadically writing about. Its aim is to figure out whether a properly-trained and -tuned LLM can do any or all of the following useful tasks:
Substitute for my book Slouching Towards Utopia: The Economic History of the 20th Century in the sense that having a two-hour conversation about the book with a ChatBot can do as much good for someone’s durable knowledge about the economic history of the 20th century as spending two hours reading the book.
Substitute for office hours about my book Slouching Towards Utopia: The Economic History of the 20th Century in the sense that having a fifteen-minute question-and-answer session about the book with a ChatBot can do as much for a puzzled student with fairly basic questions as can having a fifteen-minute office-hour conversation with, well, me.
Provide questions and answers for a useful test to help see if people assigned Slouching Towards Utopia have learned (much) from the book.
Provide questions and answers to help students review the ideas in Slouching Towards Utopia as part of an “active learning” process to cement its contribution to their intellectual panoply.
Generate thought-provoking questions and discussion prompts that can serve as seminar topics for classes on Slouching Towards Utopia.
Search out links to additional relevant resources for thinking about questions raised in and by Slouching Towards Utopia.
The problem is that, so far, at least, it is not quite there. It is much more articulate than it is accurate and insightful. And the byproduct of its being articulate is that the answers it gives are a combination of:
answers that I would give,
answers that a high school senior, who has my book at hand but does not really understand it, would give,
answers that are the general opinion of the internet.
One important reason that this is not good enough is that it is so damned articulate. It speaks with such authority and confidence. A “here are some paragraphs that might be relevant” ChatBot might actually be more useful.
So this fall I want to sic some people on the problem of making SubTuringBradBot of acceptable quality.
Some theses I tentatively advance as I start this project:
I am less optimistic than I was last March, when I fed Charles Maier’s brand-new The Project-State and Its Rivals: A New History of the Twentieth and Twenty-First Centuries <https://www.amazon.com//0674290143> to GPT and then began asking questions of MaierBot. I found that it was like talking to a smart freshman who had read the book closely and remembered it well, even though they clearly did not completely understand it. Deep questions led it to metaphorically throw up its virtual hands. But I concluded that Charlie Maier could turn the text from the ChatBot into the equivalent of an interview, and do so in maybe 1/5 the time it would take all-told to do the interview himself. I stand by that. But the last mile of getting the human out of the loop for even simple conversations is really hard.
It is, after all, just a Chinese-Speaking Room: manipulating symbols according to rules. It responds to a question by: (a) constructing a metric of page similarity; (b) using that metric to find the pages from Brad DeLong’s Slouching Towards Utopia that are most similar to the question; (c) constructing a “prompt” by adding those most-similar pages to the question; (d) calculating the successor pages to the enhanced question’s “nearest neighbors” in the training data; and (e) averaging those successor pages in some way, and returning that average as its answer. The gap between that answer and what my answer would be is… rather large.
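The retrieve-then-complete loop described above can be sketched in a few lines of Python. This is a minimal toy, not the actual system: the `embed()` here is a crude bag-of-words stand-in for a learned embedding model, the book pages are invented placeholders, and steps (d) and (e), the averaging over successor pages, happen inside a real LLM and are only stubbed out.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy stand-in for step (a)'s similarity metric: bag-of-words
    counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, pages, k=2):
    """Step (b): find the k pages most similar to the question."""
    q = embed(question)
    return sorted(pages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

def build_prompt(question, pages, k=2):
    """Step (c): prepend the most-similar pages to the question."""
    context = "\n\n".join(retrieve(question, pages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Steps (d)-(e) -- generating the "average successor page" -- happen
# inside the LLM itself, represented here by a stub:
def llm(prompt):
    return "<completion conditioned on the prompt goes here>"

# Placeholder "pages", not quotations from the book:
pages = [
    "The long twentieth century began in 1870 with the industrial research lab.",
    "Hayek and Polanyi frame the century's central argument over the market.",
    "Recipes for bread in the medieval European countryside.",
]
prompt = build_prompt("When did the long twentieth century begin?", pages)
```

The gap between the stub `llm()` and a real completion is exactly where the articulate-but-inaccurate answers come from: everything upstream of it is just nearest-neighbor lookup.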
ChatBots are weird. Text chunks that we think are “close to” each other in meaning are not always chunks that it thinks are “close to” each other in meaning. And the way the thing is constructed, it is not possible, for me at least, to figure out why its map of meaning-similarity is different from mine.
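One toy way to see how a similarity metric can diverge from human judgment: a purely surface-level (bag-of-words) metric scores a genuine paraphrase at zero and rewards mere vocabulary overlap. A learned embedding does much better than this crude version, but its failures are of the same kind, just harder to inspect:

```python
from collections import Counter
from math import sqrt

def cosine(a_text, b_text):
    """Cosine similarity over bag-of-words counts -- a deliberately
    surface-level notion of 'closeness', for illustration only."""
    a, b = Counter(a_text.lower().split()), Counter(b_text.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Two sentences a human reads as near-synonyms share no tokens:
paraphrase = cosine("wages rose rapidly", "pay increased quickly")

# Two sentences that merely share vocabulary score much higher:
vocabulary_overlap = cosine("wages rose rapidly", "the river rose rapidly")
```

The metric's map of "close" and the reader's map of "close" simply disagree, and nothing in the metric tells you where.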
I think the gap will stay large except in limited domains. Where the overwhelming bulk of the training material provides lots of correct, positive exemplars of answers-as-successor-pages and few incorrect, negative exemplars, it does very well. Thus you need to tickle it so that it lands in that corner of the vector space. Otherwise? Otherwise, it will act like a high school senior who more than half-groks the patterns of symbols, but does not really understand.
The enormous computational resources burned to create the LLM-ness of this form of human-computer interaction do not bring with them any true “reasoning” capabilities at all. It is Stochastic Parrotage. It sounds as human as it does because most of our speech is also Stochastic Parrotage. My further suspicion is that just feeding it a book will not be good enough, precisely because it does not have true reasoning capabilities.
Picking up what a desired Q-&-A would look like from a linear book text is really not within its conversation-simulation scope.
But if you were to feed it a catechism—something primed well to associate questions with desired answers—might it produce something that would pass?
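A catechism, in training-data terms, is a set of explicit question-answer pairs. A minimal sketch of what preparing one might look like, assuming a JSONL schema modeled on common chat fine-tuning formats (the schema and the sample pairs below are illustrative assumptions, not any vendor's specific API or quotations from the book):

```python
import json

# A "catechism": explicit question-answer pairs priming the desired
# associations. These pairs are invented placeholders.
catechism = [
    ("What does 'slouching towards utopia' mean?",
     "Material progress after 1870 was enormous but never delivered "
     "the utopia it seemed to promise."),
    ("Why start the long twentieth century in 1870?",
     "The industrial research lab, the modern corporation, and "
     "globalization together changed the pace of growth."),
]

def to_jsonl(pairs):
    """Serialize Q-&-A pairs as one JSON object per line -- a schema
    modeled on common chat fine-tuning formats (an assumption)."""
    lines = []
    for q, a in pairs:
        lines.append(json.dumps({
            "messages": [
                {"role": "user", "content": q},
                {"role": "assistant", "content": a},
            ]
        }))
    return "\n".join(lines)

dataset = to_jsonl(catechism)
```

The point of the format is the priming: instead of hoping the model infers question-answer associations from linear book text, each desired association is stated outright.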
What other directions should I also try to have my team explore this fall?
And I cannot resist: From The Lord of the Rings, Gandalf, flustered, trying to talk shop with people who have no idea what he means:
I found myself suddenly faced by something that I have not met before. I could think of nothing to do but to try and put a shutting-spell on the door. I know many; but to do things of that kind rightly requires time, and even then the door can be broken by strength. As I stood there I could hear orc-voices on the other side: at any moment I thought they would burst it open. I could not hear what was said; they seemed to be talking in their own hideous language. All I caught was ghâsh: that is “fire”.
Then something came into the chamber—I felt it through the door, and the orcs themselves were afraid and fell silent. It laid hold of the iron ring, and then it perceived me and my spell. What it was I cannot guess, but I have never felt such a challenge.
The counter-spell was terrible. It nearly broke me. For an instant the door left my control and began to open! I had to speak a word of Command. That proved too great a strain. The door burst in pieces. Something dark as a cloud was blocking out all the light inside, and I was thrown backwards down the stairs. All the wall gave way, and the roof of the chamber as well, I think…
Re: Now is the time for grimoires
Haven't Charlie Stross's Laundry novels occupied that space for quite some time now? Bob Howard is a master at this, although others seem ever more powerful with the techniques.
As an MIT-trained CS major myself, I find that it helps to remember that any magic sufficiently advanced is indistinguishable from technology.