

I guess I don’t really see the problem with that, though. There are configuration levers you could be pulling, but the sites you’re hosting aren’t pulling them. There are lots of shady questions about how these models are getting training data, but crawlers have a well-defined opt-out mechanism.
The web wouldn’t be what we know without crawlers, because they’re how you find sites. Why shouldn’t Alta Vista have one? I don’t object to what Alta Vista does with the data.





But Meta’s will, and so will Alta Vista’s. I’m not angry at them when a script kiddie makes a bad crawler.