Yesterday’s brief history of Internet search carries a lot between its lines, some of it increasingly sinister. From the assumption that search results should be ranked by popularity rather than quality of content, to Google’s latest AI overviews, so much runs counter to all we had come to learn in previous millennia.
Many of our greatest insights and ideas have been far from popular at the time, and some have been so reviled that their authors have been ostracised as a result. Indeed, the term ostracism originates in a practice of the ancient Greeks, who came to recognise that it led to popular but flawed outcomes, when the great were rejected by the ill-informed opinion of the mob.
By a quirk of fate, the screenshot of Google Scholar in use showed search results from 2011 for the terms autism vaccine, a topic that has recently returned to the headlines. Claims made by some of today’s politicians have been propagated using the same principles as PageRank until millions of people have been fooled into believing what were demonstrably fraudulent results. The mob are about to throw away decades of public health improvements for the sake of palpable lies.
We now have new tools to amplify such nonsense, in ‘AI’ built on large language models, and they’re starting to supplant search. In doing so, they’re going to destroy the raw material they feed on to generate their summaries.
Before about 2000, the great majority of information was printed on paper. There must have been a dozen or more specialist Mac magazines, and a steady stream of popular books about Mac OS and how to get the best from it. Even Apple was a prolific originator of thoroughly well written reference guides in its Inside Macintosh series, published by Addison Wesley. In the following couple of decades, most of those vanished, replaced by websites financed by advertising income, hence the industry dominated worldwide by Google.
Blogs originated in the mid-1990s and by about 2010 had reached a peak in their numbers and influence. Since then many have ceased posting new articles, or simply vanished. The generation that took to the web around 25 years ago are now trying to retire, sick of spam comments and the vitriolic spite of those who abuse them. Unsurprisingly, the next generation are less enthusiastic about taking to their blogs, leaving some to make money from ephemeral video performances.
If there’s one thing that Google could have done to further the decline of the remaining online publications and blogs, it’s to plunder their contents, massage their words with the aid of an LLM, and present those as overviews. When you’ve researched an article over several days and spent many hours writing and illustrating it, it’s more than galling to see an AI present its paraphrase as its own work.
These AI overviews range from the accurate, through repetitious waffle, to those riddled with errors and contradictions. Had they been written by a human, I’d describe their author as a shameless and inaccurate plagiarist who has little or no understanding of what they’re plagiarising.
You can see examples of this by making quick comparisons between Google’s AI overview and the articles that it links to. For instance:
- Ask Google “what is the boot volume structure in ios?” and compare that overview with this article. For added entertainment, try the same with iPadOS, and spot the differences.
- Ask “what does runningboard do in macos?” and notice how sources given date from 2019 and 2021, when RunningBoard had only just been discovered. Refer to a more recent account such as that here, to see how out of date that overview is, and how much it has changed in Sequoia.
There’s also an element of unpredictability in those overviews. Repeat one after a couple of minutes, and the results can be quite different.
Although Cloudflare has developed a method that enables commercial publishers to control Google’s ability to scrape their content and plagiarise it, for the great majority of us, there seems little we can do but watch page views continue to fall to levels below those before the Covid pandemic. If you’ve got something better to do with your time than write for your blog, this is when you get seriously tempted.
But Google is digging a deep hole for its future. As the supply of new content to feed its LLM falls, most new articles will be generated by AI. All it will have to plagiarise then will itself be plagiarism, and it will amplify its own errors. By not referring searches to content, Google will also have killed the geese that lay its golden eggs, and lost much of its advertising revenues.
We’ll then be back full circle to curated web directories of the remaining reliable sites.