Contrary to popular belief, AI may be peaking

On the one hand, the companies building large language models are running out of human-generated text to train on. On the other hand, they're thinking of mining the text in online videos and using "synthetic content" to get more data.
In 2005, if you showed someone who hadn't used Google how good Google was and said, "and it will only get better over time", they would have been blown away and nodded numbly in agreement.
Instead we got SEO-hacking and Google AdWords.
As with most things tech, including the flaws, I'm an early adopter. Back in the late 90s when my father was diagnosed with prostate cancer, I searched the internet for information on prognoses and cures. Literally 90% of the results mentioned "proton beam therapy", and they all seemed pretty reputable. I think Altavista found something like 50,000 web pages on it.
Proton beam therapy is a real thing, and while there are pages devoted to it on most big health sites, it's not even explicitly mentioned on the Mayo Clinic's page on prostate cancer treatment. A lot of the time, prostate cancer doesn't really need to be treated at all.
If it's valuable, it's worth perverting
The reason 2005 or so was "peak google" was that Google started out as an end-run around the sort of "SEO-hacking" that had been successful before Google existed. Originally, search engines tended to trust that websites were about what they said they were about. Then they got clever and started looking at the text on the site.
So, in the late 90s, a search engine assumed that if the keywords for your web page (which you, the page's author, created) included "best", "prostate", "cancer", and "treatment" then, to quote Stefon, "look no further".
A more sophisticated search engine might actually check whether the text on the page contained those words too. And that was about it. It was common practice in the mid-aughts to include click-bait text in invisible divs on pages, which tricked most search engines.
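To make that concrete, here's a toy sketch in vanilla js of roughly how such an engine might score a page. This is not any real engine's code, just the shape of the logic (and of the exploit):

```js
// Toy late-90s-style ranking (hypothetical, not any real engine's code):
// score a page purely by how often the query terms appear in its
// self-reported meta keywords and body text.
function naiveScore(page, queryTerms) {
  const haystack = (page.metaKeywords + ' ' + page.bodyText).toLowerCase();
  return queryTerms.reduce(
    (score, term) => score + (haystack.split(term.toLowerCase()).length - 1),
    0
  );
}

// Text in an invisible div is still text: a page stuffed with
// "best prostate cancer treatment" outranks a genuinely relevant page
// that mentions those words only a few times.
```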
Google, on the other hand, basically tried to figure out whether sites that seemed to be about X pointed to your page, which suggested your page, too, was about X. Then the link-farming and other hacks began, and combined with the business model Google chose to adopt (sponsored search results, assigned by auction, inserted above the actual results), Google got worse and worse.
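That insight is the intuition behind PageRank. Here's a toy sketch (emphatically not Google's actual algorithm, just the flavor of link-based ranking, and of why link farms defeat it):

```js
// Toy link-based ranking in the spirit of PageRank (hypothetical sketch,
// not Google's actual algorithm). `links` maps each page to the pages it
// links out to; a page's score is fed by the scores of pages pointing at it.
function rankPages(links, iterations = 20, damping = 0.85) {
  const pages = Object.keys(links);
  const n = pages.length;
  let rank = Object.fromEntries(pages.map(p => [p, 1 / n]));
  for (let i = 0; i < iterations; i++) {
    const next = Object.fromEntries(pages.map(p => [p, (1 - damping) / n]));
    for (const p of pages) {
      const out = links[p];
      if (out.length === 0) continue; // dangling page: ignored in this toy
      for (const target of out) {
        next[target] += (damping * rank[p]) / out.length;
      }
    }
    rank = next;
  }
  return rank;
}

// Link farms attack exactly this: manufacture pages that all point at yours.
rankPages({
  shill: ['honest'],
  honest: ['shill'],
  farm1: ['shill'],
  farm2: ['shill'],
}); // "shill" ends up with the highest score: manufactured relevance.
```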
AI is just as hackable, and it's already happening
Ignore the placeholder image! This video is directly on point (and discusses the article Adobe’s Generative AI Jumps The Shark, Adds Bitcoin to Bird Photo).
Generative AI tools used to work pretty well most of the time, but they're starting to hallucinate. In the linked example, a photographer deleted an ugly highlight and Photoshop inserted a bitcoin logo. Maybe this is deliberate content-flooding by someone, or maybe it's accidental. Either way, polluted AI-generated content has infected a model Adobe presumably husbands pretty carefully.
I see evidence of unintentional hacking all the time. When software platforms get successful, there's a network effect. If you want your rank on StackOverflow to go up (for some reason), answer questions about the currently fashionable framework. If you want to burnish your web-dev cred on Medium, write articles about it, and so on. As a result, the vast majority of front-end code on the public web is React-related, often tacitly assuming React imports, declarations, and patterns are in play. This is so pernicious that it's almost impossible to convince ChatGPT not to write code like this. I ask it to write vanilla js, it writes React. I ask it to read my documentation, it writes React.
And it's not good React code either. (Hilariously, I've heard OpenAI has just opened up a ton of ReactJS front-end dev positions.)
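To make the complaint concrete, here's a hypothetical but representative exchange. Asked for a plain counter button in vanilla js, the model tends to produce JSX that silently assumes a React toolchain (reproduced as comments below); the actual vanilla version runs in any browser with no build step:

```js
// What you often get back, even after asking for "vanilla js": JSX that
// quietly assumes a React build toolchain (hypothetical but representative).
//
//   function Counter() {
//     const [count, setCount] = useState(0);
//     return <button onClick={() => setCount(count + 1)}>{count}</button>;
//   }

// What vanilla js actually looks like: no imports, no build step.
const button = document.createElement('button');
let count = 0;
button.textContent = count;
button.addEventListener('click', () => {
  count += 1;
  button.textContent = count;
});
document.body.appendChild(button);
```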
Stanford conducted a study showing that ChatGPT outperformed doctors at making diagnoses from patient histories and test results. Even if doctors are healthily skeptical of this result, it's been widely reported, and given these stories people are going to use ChatGPT for diagnosis; after all, they cheerfully used WebMD for years when everyone told them not to.
Convincing doctors to change their prescribing habits is difficult and expensive. In most civilized countries (including the US, until Big Pharma found a loophole in the late 90s / early 2000s) it is illegal for drug companies to advertise prescription drugs to consumers. Instead, Pharma sends reps to doctors, hospitals, and pharmacists to convince them to prescribe or dispense its drugs, by means both fair and foul. It's a big business. Just writing software to help those reps market a few drugs for a single company in Australia paid me handsomely for a decade.
Now imagine you can just generate content designed and A/B-tested to influence AIs to recommend your drug. Would you do it? More importantly, would a multi-billion-dollar company do it? A company that already spends buckets of money on pens, post-it notes, cardboard signs, "medical conferences" in tropical paradises, cruises, politicians, 24-year-old communications grads who look incredible in stilettos and pencil skirts, TV ads, and mentions in movies and medical dramas. I mean, they have some ethical standards, I'm sure. We just haven't found them yet.
As easy as it is to dumb down the populace by destroying the education system, inventing reality television and social media, and so on, AIs can do it faster and in real time.
And I'm talking about humans using AIs to corrupt other AIs: testing whether it's working, then doing more of what works, faster and harder. This is exactly what SEO did to Google.
Curation is a Partial Fix
I work for a company that processes invoices using custom AI models. The training data is highly curated and so everything basically just tends to get better over time. Find a mistake? It probably originated in the training dataset so we go fix the training data and then see if the retrained AI still makes the mistake. But we (a) control the data and (b) know what the correct answer is. This is very much not how LLMs work.
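A minimal sketch of that feedback loop, with a toy stand-in model (everything here is hypothetical, not our actual pipeline):

```js
// Hypothetical sketch of a curated-data feedback loop (not an actual
// pipeline). The key property: every field has a known correct answer,
// so a mistake can be traced back to the training examples that caused it.

// Toy "model": memorizes the label last seen for each input.
function train(examples) {
  const table = new Map(examples.map(ex => [ex.input, ex.label]));
  return { predict: input => table.get(input) };
}

function fixAndRetrain(mistake, trainingSet) {
  // 1. Find training examples that taught the model the wrong answer,
  //    and correct their labels at the source.
  for (const ex of trainingSet) {
    if (ex.input === mistake.input && ex.label === mistake.wrongAnswer) {
      ex.label = mistake.correctAnswer;
    }
  }
  // 2. Retrain and confirm the mistake is gone.
  const model = train(trainingSet);
  return model.predict(mistake.input) === mistake.correctAnswer;
}
```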
The people who build LLMs kind of control the data, but their goal is quantity first and quality second, and they have no idea what the question is, let alone what the correct answer is. The answer you get is just a statistically likely response of the sort the dataset's random contributors might have given.
This isn't just a problem with generative AI, although it seems to be an especially bad problem for generative AI. When it comes to front-end coding, "AI" is currently "artificial stupidity". This is what happens when curation fails at scale.
All of this doesn't fully justify my clickbait title. Obviously, AI (and software as a whole) will continue to progress. But the thing currently generating all the hype, LLMs and generative AIs powered by all the content they can scrape, may be peaking as they themselves are used to pollute, at scale, the very resource they depend on.
— Tonio Loewald, 1/21/2025