> When I am searching for something, I usually want to find primary sources.
And therein lies the rub; for years now Google's search results have returned useless SEO garbage. For now, it definitely seems like an LLM answer is better than what was being returned and I guess this is the reason why Google ripped it out.
An LLM answer is not "better", it's in a completely different category. LLM answers can be useful for topics where you can easily verify a fact (e.g., if you ask for a Linux command and it gives you one, you can run it and see if it did what you wanted), or for topics which are more opinion than pure fact ("list some trade-offs between decision A and decision B"). But when you want information that's provided by some authoritative source, you want to see it from that source.
Google Search has been terrible for a long time. But you could still dig through it and find those primary sources. That is, in my opinion, the primary purpose of a search engine. Replacing it with what an LLM has invented based on ingesting both reliable and unreliable sources is not viable for a large category of things. The main way we can judge the reliability of something is to look at where it comes from. If I'm looking for, say, official US job market statistics, whether I trust the numbers I find depends on whether I find them published on a US government website or on a random person's blog. A number presented to me by a chat bot would not let me judge, so it's useless.
The best a language model could possibly do, by definition, is to find websites and link them to me, letting me judge their credibility. But then it's just a worse search engine.
Personally I think I've developed a pretty good sense of when a question is easy enough that I can just trust the AI overview, and when I need to dig deeper. Google's original AI overviews were not reliable enough to ever trust, but now they are usually accurate summaries of the cited sources.
Job market statistics are actually probably a strong point for the AI overview. I just Googled 'us job market last month' and got an AI overview that accurately summarized a New York Times article for qualitative information ("surprisingly strong 115,000 jobs", "no-hire, no-fire"), followed by accurately summarizing the official Bureau of Labor Statistics website for raw stats, followed by some other stuff I didn't check. Not everyone would prefer The New York Times' take, but the citation prominently displays their name and logo, so you can tell what you're getting.
Weak points are when the topic is obscure enough that the AI overview conflates two different things or overgeneralizes, or trusts the wrong sources.
The training process literally ingests the majority of text on the internet, including a huge volume of SEO garbage, and seeks to create a self-consistent compressed model of that. This is far from perfect, of course, but it is also likely more truthful than the median Google result, because of the incentive for self-consistency and coherence created by the reward function and during RL.
Imagine that you had 1,000 years to read every Google result on a particular topic, and literally infinite patience. You would read a lot of rubbish, but ultimately you are a smart person: you would figure out the underlying truth and likely produce something that is more valuable than the average or even the sum of the parts.