
I guess the Google bot is now on my blocked list.

There should be a public corporation, like Wikipedia, that scrapes the Web and provides an API for anyone to access.



You mean this? https://commoncrawl.org/


The Internet Archive?


But does Google actually 'own' the index? Have they acted anticompetitively as far as that goes?


That's a very good question that few people consider. Here is the answer: https://knuckleheads.club/how-google-distorts-the-market/


They own tons of engagement data associated with that index from being the default search engine on most devices.



