Discussion about this post

Alex Tolley:

Some very interesting observations.

What should we call this technology? IMO, we should use easy-to-recall names, not acronyms. Just as rule-based models with hand-coded knowledge were called "expert systems", and models whose rules were derived from data were called "decision trees", I would use the term "language systems" or "language models".

IMO, these language models are replicating Kahneman's "thinking fast" (System 1). System 1 allows fluid verbiage that flows without thinking. For example, when I used to mimic Robin Leach's verbiage about the lifestyles of the rich, I had no idea what I was going to say next; I just let the words flow. In a similar vein, for those with two or more languages, stress may revert the structure of spoken English (a second language) to that of the native language, which may be more deeply embedded in the cortex.

Idk about "Laws of Thought", but I have been playing with testing Chomsky's innate grammar with ChatGPT (3.5), and I think it may invalidate it. Our language acquisition may be purely mimicry, i.e., learning by example, just like language models.

Will language models change society? Historically, we built systems and machines to carefully reproduce a consistent result: from draught animals turning a wheel, to powered machinery weaving cloth, to factory systems turning out exact replicas of objects. These muscle substitutes hugely enhanced our societal productivity. Computers used in association with these tools, for example to control robots, do the same. But computers used as "bicycles for the mind" don't just speed the journey from A to B; they allow exploration and mental journeys far further afield. This is why computer software like spreadsheets, word processors, and so forth does not reduce employment but expands it. It is a mental equivalent of Parkinson's Law: mental work will expand to fill the time allowed to complete it.

Language models have, however, a flaw that you demonstrate at the outset. They are not like software that controls machines to replicate outputs; instead, they behave like probabilistic Markov models. The output will vary with small changes in input (prompts). This suggests to me that unless this can be fixed, language models are best used where accuracy is not needed. For example, software algorithms must be accurate and not break, and this is tested with QA methods, especially to detect "corner cases". (Expert systems and decision trees were "brittle": unexpected data broke the rule structure.) So language models work well for creating drafts of text, images based on similar approaches, and other creative applications where accuracy is not important. Where they fail is where accuracy is needed, such as your citation-builder example.
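
To make that Markov comparison concrete, here is a toy sketch in Python (illustrative names only, not any real library): a word-level Markov chain trained on a tiny corpus. The same seed "prompt" can produce a different continuation on each run, and changing one seed word sends the output down a different path entirely.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, seed_word, length=10, rng=None):
    """Sample a continuation one word at a time; output is probabilistic."""
    rng = rng or random.Random()
    word, output = seed_word, [seed_word]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:  # dead end: no observed continuation
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

corpus = ("the model writes fluent text and the model guesses the next word "
          "and the text flows without the model checking any facts")
chain = build_chain(corpus)

# Two near-identical "prompts" diverge, and even the same prompt
# varies from run to run.
print(generate(chain, "the"))
print(generate(chain, "model"))
```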

Just as software engineers end up in cycles of reiterating code to meet non-expert verbal instructions - "That is not exactly what I meant, can you do [X]?" - language models need to be able to handle recursion of prompt instructions. But just as that complexity often results in "I could do this faster myself than by repeatedly instructing subordinates", I think language models will not be productivity-enhancing without help. Can they be induced to attempt Kahneman's "thinking slow" (System 2)? Idk.

What I do think is possible is that they be integrated with existing software that already does the task effectively. In your example, there are many citation builders available. Some can guess the correct citation and output it directly from a database of content using just the title. If that fails, entering the details into the input fields solves the problem. Integrating a language model with an existing citation builder would likely produce accurate output and would be productivity-enhancing. If you need real math done, integrate with Mathematica to build the model and test inputs. Once a correct version is built, "fix" that model for future requests, rather than have the language model rebuild it from scratch each session.

Similarly, with texts: go through the selection process for the texts to use with the language model, and fix those texts in a database, so that requests for information from the texts always use those texts alone. It should not try to build a language model on those texts, but extract exact pieces from them to build a precis, or to build an argument "for and against" an assertion to be tested. Just as computers and software do not reinvent basic operations on numbers, language models should operate using existing algorithms where accurate results are needed, and use their creative, sloppy responses to convert the output to human language if needed.
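
A minimal sketch of that division of labor, with hypothetical function names and the language-model step stubbed out: the model only maps messy text to named fields, and a plain deterministic function does the formatting, so the citation itself cannot drift between runs.

```python
def format_mla(fields):
    """Deterministic citation formatter (simplified MLA-style web citation)."""
    return (f'{fields["lastname"]}, {fields["firstname"]}. '
            f'"{fields["title"]}." {fields["website"]}, {fields["url"]}.')

def extract_fields(raw_reference):
    """Placeholder for the language-model step: map messy text to fields.
    In a real system this would be an LLM call constrained to return
    exactly these keys; here it is stubbed with a fixed example."""
    return {
        "lastname": "Doe",
        "firstname": "Jane",
        "title": "An Example Article",
        "website": "Example Blog",
        "url": "https://example.com/article",
    }

fields = extract_fields("Jane Doe's piece 'An Example Article' on Example Blog")
print(format_mla(fields))
# Doe, Jane. "An Example Article." Example Blog, https://example.com/article.
```

The point of the split is that only the extraction step is probabilistic; the formatting is ordinary, testable code, like the existing citation builders mentioned above.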

Thomas R Howell:

"ChatGPT4 “identifies” and slots lastname, firstname, title, website, and URL into the format. It does all of these correctly."

Its subtitle is "Information Technology and the Future of Society." Your original subtitle is "Information Technology in the Service of Society."
