Discussion about this post

Kaleberg:

I think the problem is that LLMs are less than meets the eye. Apple got punked. LLMs are very good at impressing the uninformed, and that includes upper management. As Cory Doctorow noted, they don't have to be good enough to do your job; they just have to be good enough to convince your boss that they can do your job.

It's very easy to imagine a v1 that was tuned to pass one battery of tests, to the point where the engineering management involved was convinced that just another six to twelve months of tuning would get something working more generally. Unfortunately, tuning LLMs is not like writing code. Lots of people have a vague sense of how far along a piece of software is and how close it is to meeting its goals. They're usually wrong in the details, but often close enough to estimate ship dates. The joke is that the best way to improve that skill is to multiply your first estimate by your age.

LLMs are another matter. These systems are not transparent and not robust. Remember when the big thing was tricking computer vision systems by putting small stickers on stop signs to convince them that a firetruck was blocking the way? LLMs are all too similar in operation, so v1 was a lot farther from release than anyone at Apple thought. They found out the hard way, as so many others will.
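To make that fragility concrete, here's a toy version of the classic fast-gradient-sign attack in PyTorch. The stop-sign attacks used physical stickers rather than pixel math, but the underlying weakness is the same. This sketch assumes a stock torchvision classifier and an input image scaled to [0, 1]; it's an illustration, not the actual stop-sign attack.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Fast-gradient-sign (FGSM) sketch: nudge each pixel slightly in the
# direction that increases the classifier's loss, and the prediction
# often flips even though the image looks unchanged to a human.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_attack(image: torch.Tensor, label: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarially perturbed copy of `image`.

    Assumes `image` is a 1x3x224x224 tensor scaled to [0, 1] and
    `label` holds the true class index.
    """
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step every pixel by +/- epsilon along the loss gradient.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()
```

A perturbation budget of a few percent per pixel is routinely enough to flip the top-1 prediction, which is the same brittleness the sticker attacks exploited in the physical world.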

James:

A day late, but here are my thoughts on what Apple is undertaking in trying to get Siri up to snuff, and why it's a hard problem to solve:

1) Apple doesn’t just want a reliable LLM interface with their products, such as what Microsoft is doing with copilot. They also want a reliable voice interface and that’s a whole other set of challenges that no leading AI company has solved yet.

2) Apple wants their AI-enhanced Siri to work *offline*, and that's a significant challenge. Perhaps even more difficult than they were expecting.

2a) Using any generative AI at full capability right now requires an entire farm of servers with thousands of Nvidia chips networked in just the right way, with full access to reams of training data. That's a lot of overhead to make every little query and comment work.

2b) You can run simplified AI models locally on your PC, but it requires an expensive Nvidia GPU, often a part that draws more than 300 watts under load and has a physical footprint measured in tens of centimeters. On a single chip with a smaller set of data, capabilities are extremely… modest. (A minimal sketch of what local inference looks like follows this list.)

2b1) I mention Nvidia chips because nobody has an alternative to them now or in the immediate future. Apple's chip designs are amazing, but they were not developed for the work of generative AI. Running a local LLM on your iPhone just isn't possible with Apple's current chip architecture, and they can't simply copy Nvidia because, yes, patents, but also because Nvidia's designs are not efficient enough for mobile, and because Apple doesn't want to be dependent on anyone else.

3) The networking required to synchronize an LLM's operations across a server farm and a local device, with different chips, different capabilities, different latencies, and data loss under cellular or Wi-Fi conditions, is a nontrivial challenge, but Apple's vision for an AI Siri requires solving that too. (I sketch the routing half of this after my summary below.)

4) we haven’t even gotten to the software running Siri and all the little agentive interfaces into every app and program on your iPhone or your Mac.

So, Apple basically needs to be the first company to solve several different but related hard problems in order for Siri to function in the way they envision. They need an in-house AI chipset that is efficient enough for mobile, because Siri needs to be able to do some "thinking" on your local device, potentially offline. That chipset, when online, needs to work well with the large server farms, something nobody else is trying to do at the moment, as far as I know. There's a reason all of the compute is being done in server farms and not distributed across everyone's individual devices! Oh, and they've got to get a class-leading voice interface off the ground so Siri properly understands people of varying accents, languages, and blood alcohol content.
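To make the routing challenge concrete, here's a toy sketch of the hybrid split I'm describing: answer on device when the local model is confident, escalate to the server farm when it isn't, and degrade gracefully when offline. Every name here (LocalModel, CloudClient, the confidence threshold) is hypothetical; this is not Apple's actual architecture.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float  # local model's self-reported score in [0, 1]

class LocalModel:
    """Small on-device model: fast and private, but limited."""
    def generate(self, query: str) -> Answer:
        return Answer(text=f"(local draft for {query!r})", confidence=0.4)

class CloudClient:
    """Full-size model on the server farm; may fail on a bad network."""
    def generate(self, query: str) -> str:
        raise ConnectionError("offline")

def answer(query: str, local: LocalModel, cloud: CloudClient,
           threshold: float = 0.7) -> str:
    draft = local.generate(query)
    if draft.confidence >= threshold:
        return draft.text              # good enough: stay on device
    try:
        return cloud.generate(query)   # escalate to the server farm
    except ConnectionError:
        return draft.text              # offline: degrade gracefully
```

And that sketch waves away everything that makes this hard in practice: keeping the two models' behavior consistent, streaming partial state between them, and doing it all under real cellular latency.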

I have no idea why anyone at Apple thought they would solve all of this in a matter of 12-18 months! I suppose they underestimated the challenges involved.
