The Ouroboros Of Machine Learning

When machines start eating their own vomit

Image: ouroboros serpent around the head of the mummy, the second gilded shrine of Tutankhamen

Ever since I learned the word 'ouroboros' (a snake eating its own tail) I see the creature everywhere. When all the cars pull into a junction and block each other. When Disney releases another remake. When AI learns from 'Search Engine Optimized' content to make more SEO content, which is then used to train more AI. That's the Internet eating its own 'long tail'. An ouroboros if I've ever seen one.

Said The Spider To The Fly

SEO is humans trying to think like a machine (Google), and write what the machine wants to read. Now we have AI reading this already mechanical content and vomiting out its own SEO content, which gets fed back into the search engine. Like a dog returning to its vomit. We have to really question what sort of 'engine' we're building here. Sounds like a perpetual bullshit machine.

When the World Wide Web (which I use interchangeably with Internet) started, the idea was that humans would learn HTML and make their own webpages. And, at the beginning, many people did. Not that many people, however, because learning even simple code was beyond most people's ken. The web became the domain of corporations (.coms), including corporations that made it easier for humans to publish on their 'platforms'. We traded freedom for a little convenience and thus ended up with neither.

Today very few people publish webpages directly; most publish on platforms or within apps. The vast majority of this content is behind logins and unsearchable by default. The tiny fraction of the Internet that is indexed is largely monopolized by one 'platform', which is Google Search. Everybody else is fiercely competing over essentially one page, the first page of Google. Thus the World Wide Web, which was meant for anyone to talk to anyone, ends up being a bunch of corporations trying to talk to one corporation, while most humans have fucked off behind walled gardens.

The web is now literally ruled by spiders, and humans are just flies to be caught there and monetized. Google Search increasingly sucks for finding anything. It's full of corporations trying to find you instead. Personally, I can't find anything on the Internet anymore. Everything is either an algorithm trying to anticipate what I want, or an optimization, trying to research me before I even get started.

Adding AI To The Fire

AI is theoretically a solution to this problem. AI burns a small forest to run a query, but it's able to answer questions, which Google increasingly cannot. The answers are confident bullshit, which is one problem, but there's another problem. Where do the answers come from?

Machine learning requires 'data' to learn anything, which it's currently hoovering up from the human Internet. That's fine, I guess, but then the same SEO companies use AI to vomit out even more content. Thus, in the next generation, the AI consumes its own vomit. A few generations of this and what do you have? Garbage in, garbage out. What we have here is a loop of burning garbage.

At some point, you end up with an Internet with lots of machine-generated content, fed back into machine learning models, forming a perfect ouroboros of uselessness. A problem of too much information is not solved by even more information, but that doesn't stop people from trying. Indeed, all the incentives are there to run this garbage loop until it becomes a complete dumpster fire.
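For anyone who wants to poke at the loop, here is a minimal sketch in Python. It is a toy under loud assumptions, not a model of any real search engine or AI system: the 'Internet' is just a bag of numbered words with a long-tailed (Zipf-like) distribution, 'training' means counting word frequencies, and 'generating content' means sampling from those counts. The function name and every parameter are made up for illustration.

```python
import random
from collections import Counter


def simulate_generations(vocab_size=1000, sample_size=2000, generations=10, seed=0):
    """Toy feedback loop: each generation 'trains' only on the previous one's output.

    Returns the number of distinct words still in circulation after each generation.
    All names and numbers here are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)

    # Generation 0: a long-tailed distribution standing in for human writing.
    words = list(range(vocab_size))
    weights = [1.0 / (rank + 1) for rank in range(vocab_size)]

    surviving = []
    for _ in range(generations):
        # "Generate content": draw a finite sample from the current model.
        corpus = rng.choices(words, weights=weights, k=sample_size)

        # "Train the next model": it only knows the frequencies it actually saw.
        counts = Counter(corpus)
        words = list(counts.keys())
        weights = [counts[w] for w in words]

        surviving.append(len(words))
    return surviving


history = simulate_generations()
print(history)
```

A word that fails to show up in one generation's sample has zero weight in the next, so it can never come back. The count of surviving words can only hold steady or shrink, and the rare words in the long tail are the first to go. That is the ouroboros in miniature.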

Increasingly, I don't even use search anymore, and I stopped using ChatGPT at all. I've gone back to what I learned in grade school, which was to walk over to the library, find three books about the subject, and use those for my 'papers' (I am one of the few people who does exactly what we were trained to do in school).

If I need to keep up on anything I follow people I directly trust (or who are honestly wrong) on Telegram or email, all of which is broadly unsearchable. If I need to 'find' anything it better be in my bookmarks or I better just remember it, otherwise it's gone. More than anything I've been exercising that most ancient of search engines, my brain. It's a bit rusty, but it works.

What I do now is read the same books over and over, or repeat myself, or try to use rhyme, the classic mnemonic systems. It baffles me that the truest things in the world would coincidentally rhyme, but they somehow do. This is certainly less information, but it's better information, which is the point. As the Buddha said, “Even if there be a thousand verses—a mere collection of useless words—one verse is better having heard which one is pacified.” I found that in a well-worn copy of the Dhammapada, which I cannot find online and can barely find copies of. But it still exists. And it's still good. It's outside of the infinity loop we call the information superhighway, but that's increasingly a garbage loop, as we've discussed.

One wonders what the concept of a search 'engine' even means once you start feeding more and more SEO content into it, and then even more AI/SEO content. Up to a point, this works as a turbocharger, but after that point the engine is just burning garbage. We seem to be trying to build a perpetual motion machine, which never works, but I guess that doesn't stop people from trying. As long as it's profitable, the ouroboros will keep eating its tail, and there's a long tail to consume. Disney will keep making remakes, record companies will keep making remixes, and the search engines will keep returning to their own vomit until they finally choke on it.