Comments Page - Google Books has been effectively killed by the last algorithm update

Source: old.reddit.com
Submitted by adamnemecek 2 hours ago

al_borland 2 hours ago
It might be time to update the mission statement.
“Our mission is to organize the world’s information and make it universally accessible and useful”
https://about.google/company-info/
- zb3 an hour ago
  * for us, advertisers and our AI models
  ern_ave 26 minutes ago
  My guess is that AI training is the main issue.
  Data that you can prove was generated by humans is now exceedingly valuable ...and most of that comes from the days before LLMs. The situation is a bit like how steel manufactured before the nuclear age is valuable.
  adamnemecek 23 minutes ago
  But why would people train on excerpts from Google Books when the whole books can be downloaded on libgen and such?
  asdefghyk 8 minutes ago
  copyright reasons?
  direwolf20 5 minutes ago
  Both are a copyright violation
abetusk an hour ago
Anna's Archive [0]:
> The largest truly open library in human history
[0] https://en.wikipedia.org/wiki/Anna%27s_Archive
- cft 36 minutes ago
  Mirrors https://open-slum.org/
bryanrasmussen an hour ago
Since I pretty much only use Google Books for public domain books, old magazines, and newspapers I haven't noticed any problem with it. Maybe it's not as dead as this person thinks.
- mikestew 9 minutes ago
  This was addressed in the post, I'm sure you just missed it when you read it:
  "But a few days ago they removed ALL search functions for any books with previews, which are disproportionately modern books." <emphasis mine>
- adamnemecek 38 minutes ago
  No, the search results went from pretty good to absolute garbage: https://bsky.app/profile/adamnemecek.bsky.social/post/3mdbup...
xorsula1 2 hours ago
My guess is they detected being scraped and did this as a preventive measure.
- breppp an hour ago
  my guess is that the copyright landscape changed due to AI training, and these publishers won't let Google use that data anymore
  adamnemecek an hour ago
  The books are still there, it seems like the rankings have changed though.
mystraline 2 hours ago
That's easy.
Check out Library Genesis, Anna's Archive, and Sci-Hub for content.
Piracy isn't theft if buying isn't ownership.
- GorbachevyChase 13 minutes ago
  Ironic those doing the most for making information open and accessible are the criminals.
  direwolf20 4 minutes ago
  Of course. When it's criminal to make information open and accessible, only criminals will make information open and accessible.
- adamnemecek 2 hours ago
  None of these does full text search.
  jszymborski 2 hours ago
  And they are under constant threat from nation states. Sci-Hub hasn't seen new papers in ages.
  greenavocado 2 hours ago
  Build a local index
  adamnemecek 2 hours ago
  My problem is finding references I don't know about.
  droopyEyelids an hour ago
  Z-Library does
  https://en.wikipedia.org/wiki/Z-Library
  clueless 14 minutes ago
  I'd wonder if you'd ever consider putting up a downloadable mirror of their full-text search db?
  adamnemecek an hour ago
  Huh, the search is not amazing but it will have to do. Thanks! Are there others?
  teraflop an hour ago
  The Internet Archive supports full-text search on (AFAIK) its entire scanned book collection, even books that aren't available for borrowing.
  adamnemecek 23 minutes ago
  This is actually pretty good.
kingstnap 2 hours ago
My guess: text search and indexing are expensive, so you are getting some kind of AI vector search instead,
which tends to be kind of poop compared to true text search.
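For anyone wondering what "true text search" means here, a minimal sketch of an inverted index in Python follows. It is an illustration only, not how Google Books or any of the sites mentioned in this thread actually work. The sample documents and the names `tokenize`, `build_index`, and `search` are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Start from the first token's postings and intersect with the rest.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample corpus, invented for this example.
docs = {
    "a": "Steel made before the nuclear age is valuable",
    "b": "Full text search over scanned books",
    "c": "Scanned books from before the nuclear age",
}
index = build_index(docs)
print(search(index, "nuclear age"))
print(search(index, "scanned nuclear"))
```

The point is that an exact-match index only returns documents that literally contain the query words, which is what you want when hunting for a specific phrase or reference. A vector search returns whatever is "close" in embedding space, whether or not the words appear.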
ChrisArchitect an hour ago
Title is: Google has seemingly entirely removed search functionality from most books on Google Books
adamnemecek 2 hours ago
The change happened on or around Jan 21. Overnight the results went from pretty good to absolute trash.
Here are two screenshots taken on Jan 20 and Jan 23 https://bsky.app/profile/adamnemecek.bsky.social/post/3mdbup...
They don't do full-text search anymore, especially for copyrighted books. I wonder if this is not a regression but an intentional move to give them a leg up in the AI race.
- jeffbee an hour ago
  It isn't obvious why the left results are preferred over the right results.
  advisedwang an hour ago
  The left results are contemporary, the right are decades old. That includes editions of the same book --- surely the newer edition is going to be preferred by most readers.
  thaumasiotes 2 minutes ago
  > surely the newer edition is going to be preferred by most readers.
  Why? Where different editions exist, the reader will want to know which one they're getting, but they're unlikely to systematically prefer newer editions.
  But also, Google Books isn't aimed at "readers". You're not supposed to read books through it. It's aimed at searchers. Searchers are even less likely to prefer newer editions.
  jeffbee an hour ago
  I guess. That's not immediately clear to me. However, browsing around on Google Books suggests to me that it is the corpus which changed, not the algorithms.
  adamnemecek an hour ago
  The corpus is still the same: searching for the name of a book will still find it, but the full-text search no longer works.