9 Comments

> Thus I believe that, given the evolving information ecology in which our undergraduates are already immersed, we have a very strong moral obligation to do what Josh Gans and Kevin Bryan and company at All Day TA are doing—to attempt to train the MAMLMs that our students are going to consult over the next semester as the first-line answerers of their questions, and train them to be as high-quality as possible:

I hope it's possible, but prima facie it looks like a "nerd harder" strategy. Training MAMLMs to be accurate, reliable question-answerers is not something we know how to do; it's true that with care a tweaked version of an existing one will be better than the un-tweaked version, but "better" isn't "good enough." (Or: if they can do it for this, they can do it in any number of more profitable contexts...) Surely universities have always tried to teach students to use good sources instead of attempting to marginally improve bad ones?

To put it in another way: it looks like, to my chagrin, students might spend all of their education and perhaps professional lives working with MAMLMs; most of them will have been developed without much care or even good intent. Isn't it best for them to learn this and ways around this as early as possible?

On the other hand, I'm saying this from the quite comfortable position of not being a professor, TA, etc, having to deal with the messy reality of actual student behavior and consequences. Maybe harm reduction while something better is being worked on is what the situation calls for? I acknowledge the generational importance of the problem & I'm exceedingly glad of it not being on my plate.

How much of the "training data" being fed to large language correlation models has been paid for through the various copyright/patent/IP mediation & payment organizations, or directly to the content creator?

Effectively zero... -B.

I don't recall overall numbers, but as the core of it was "a crawl of the Internet," my rough estimate would be that they paid zero for the large majority of it, and are paying a small amount to platforms (not creators) for some specific incremental feeds (like some news orgs, Reddit-style platforms, etc.).

So - massive violations of copyright law, published licensing agreements, etc. Great way to create a new industry!

Not so fast. If you go to a gallery and like an artist, you can emulate their style or technique to produce original works without violating copyright, as long as you don't merely make obvious changes to an existing work. This applies to literary and musical works too. Copyright court cases are based on whether the new work is too close to the original. As with all art, this is subjective.

GenAI can be very creative or not so creative depending on the settings of control variables and the input prompts.

This is off-topic relative to the content of your post, but I do recommend reading Tom Shippey's LRB review of "The World the Plague Made", by James Belich (https://www.lrb.co.uk/the-paper/v46/n21/tom-shippey/blame-the-gerbils). The book (which I have not read) contains the sort of "big idea" about historical development that interests you, for example:

"[Population recovery] was once thought to be relatively rapid, taking perhaps a century, but that now seems another underestimate. England did not return to its pre-plague population until about 1625, 280 years after the first strike. During most of that period Western Europe had about half the population it had in 1345. And yet 1400-1500 ‘is the very century in which Western Europe’s global expansion began’, the period of what has been called ‘the Great Divergence’ between Europe and the rest of the world. ‘The Black Death and the Rise of Europe’, as Belich’s subtitle has it, do seem to be linked in time, and it may not be a coincidence."

Could be. The plague triggers a shift in the economy's Malthusian equilibrium to a less-populous, richer configuration... -B.

That's exactly the argument: less labour to serve the same productive assets caused an increase in productivity, because of selection on the productivity of the assets. This increased the disposable income of farmers at the same time as, paradoxically, labour was released because fewer people could produce more food per capita. It also raised the value of better technological inputs (e.g. wind and water power), an argument that has been made before with respect to England. There are various facts adduced to support these claims. Belich explicitly does *not* claim the plague as the *main* driver of European development, but thinks it is the main "missing" or underappreciated driver.
