The History of NLP
How AI Learned to Talk: From Zork to Autocomplete.
That’s Natural Language Processing, not the other NLP. :)
Z-Games 🔗
If you’re very old or indulge in ancient novelties, you might’ve played Adventure, the direct inspiration for the most famous text-based game of the modern computing era: Dungeon, or Zork. It was ported to the Z-Machine format and ZIL (an early DSL) by Infocom, who released a number of similar games, including The Hitchhiker’s Guide to the Galaxy.
Around this time, computers were starting to be office-desk sized - personal - and finding their way into homes. Games took off in a couple of directions: one was the graphics-plus-text style of Sierra games (like the Leisure Suit Larry series), while the other stayed text-based and became the myriad MUDs, MUCKs, and other MU* games on the simultaneously blossoming internet. A tick later we had online graphical games like Ultima Online to complement the text-based ones, and from there it just kept going.
How LLMs Work
Demystifying LLM Architecture
“Oh, there is a brain all right. It’s just that the brain is made out of ~~meat~~ math!”
“So… what does the thinking?”
“You’re not understanding, are you? The brain does the thinking. The ~~meat~~ math.”
“Thinking ~~meat~~ math! You’re asking me to believe in thinking ~~meat~~ math!”
Aka: the documentation I wanted but couldn’t find in one place anywhere, and which I’ll also want as an ongoing reference for myself.
How many FLOPs
Why are LLMs so Power Hungry?
I knew every token gets np.dot()’d against every vocab token’s array of embedding-length floats and so on, but I was curious just how many calculations an LLM does per token, on average. I asked GPT-4o, then Grok. Their answers were essentially the same, but Grok’s is a bit more detailed, so I’m pasting it in full here, just because it’s a pretty intense read and I’m sure I’m not the only one who’s wondered - so I’ll save you the trouble of futzing around and comparing notes to find it. :)
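Before the full answer, here’s a rough back-of-the-envelope sketch in Python of the common “about 2 FLOPs per parameter per token” estimate for a decoder-only transformer, plus the context-length-dependent attention term. The config numbers below are made-up stand-ins roughly in the shape of a ~7B-parameter model, not any specific model’s real dimensions.

```python
# Back-of-the-envelope FLOPs-per-token estimate for a decoder-only transformer.
# All dimensions here are illustrative assumptions, not a real model's config.
n_layers   = 32            # transformer blocks
d_model    = 4096          # hidden size
d_ff       = 4 * d_model   # feed-forward inner size
vocab_size = 32_000        # embedding / unembedding rows
seq_len    = 2048          # context positions attended over

# Weight parameters per layer: attention (Q, K, V, O projections) + MLP (up + down).
attn_params_per_layer = 4 * d_model * d_model
mlp_params_per_layer  = 2 * d_model * d_ff
params = n_layers * (attn_params_per_layer + mlp_params_per_layer) \
         + vocab_size * d_model   # the unembedding matmul - the np.dot over the vocab

# Each weight is used in one multiply-accumulate per token: ~2 FLOPs per parameter.
matmul_flops = 2 * params

# Attention scoring itself (QK^T and attn @ V), which grows with context length.
attention_flops = 2 * n_layers * 2 * seq_len * d_model

total = matmul_flops + attention_flops
print(f"~{params / 1e9:.1f}B weights -> ~{total / 1e9:.1f} GFLOPs per generated token")
```

Swap in a real model’s dimensions and the same arithmetic applies; the point is just that every generated token costs on the order of a couple of FLOPs per weight, which is why the totals get big fast.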