@si1very @Making8 @searchmartin @JohnMu @googlesearchc @Google You summoned me?
We do indeed index sitemap files sometimes, mostly when someone links to them for whatever reason. If you don't want to see them in the results, add the noindex header Martin suggested.
PS welcome to my daily Twitter digest
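For reference, a noindex header like the one mentioned above is typically sent as an `X-Robots-Tag: noindex` HTTP response header. Here is a minimal sketch, assuming the sitemap lives at `/sitemap.xml` and is served with Python's built-in `http.server`. The handler name, the `sitemap_headers` and `serve` helpers, the port, and the empty sitemap body are all illustrative assumptions, not anything from the thread:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder sitemap body (assumption: an empty but valid urlset).
SITEMAP_BODY = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>\n'
)


def sitemap_headers() -> dict:
    """Headers sent with the sitemap; X-Robots-Tag asks crawlers not to index it."""
    return {
        "Content-Type": "application/xml; charset=utf-8",
        "X-Robots-Tag": "noindex",
    }


class SitemapHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves only /sitemap.xml."""

    def do_GET(self):
        if self.path != "/sitemap.xml":
            self.send_error(404)
            return
        self.send_response(200)
        for name, value in sitemap_headers().items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(SITEMAP_BODY)))
        self.end_headers()
        self.wfile.write(SITEMAP_BODY)


def serve(port: int = 8000) -> None:
    """Run the sketch locally; call serve() by hand to try it."""
    HTTPServer(("127.0.0.1", port), SitemapHandler).serve_forever()
```

The file stays fetchable this way, so it can still be read as a sitemap. The header only asks that it be kept out of the results.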
@andrea_moro @rustybrick like, random one-off crawls we used to do for learning something. Wanna learn how many sites *require* an Accept-Language request header? Run a crawl job. This used to be with Googlebot, now it's gonna be with GoogleOther.
Please don't overthink it, it's really that boring.
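To make the kind of one-off question above concrete, here is a rough sketch of such a check. It is not how Google's crawl jobs work. It assumes "requires" means a URL returns a non-2xx status without an `Accept-Language` header and a 2xx status with one. The function names, the header value, and the timeout are all hypothetical:

```python
import urllib.error
import urllib.request


def http_status(url, headers):
    """Fetch url with the given extra headers and return the HTTP status code."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the status code is what we want here.
        return error.code


def requires_accept_language(url, fetch=http_status):
    """True if the URL fails without Accept-Language but succeeds with it.

    Assumption: 'fails' means any non-2xx status; a real study would likely
    need a subtler definition (redirect loops, empty bodies, and so on).
    """
    without = fetch(url, {})
    with_header = fetch(url, {"Accept-Language": "en-US,en;q=0.9"})
    return not (200 <= without < 300) and (200 <= with_header < 300)


def count_requiring(urls, fetch=http_status):
    """Count how many of the given URLs appear to require the header."""
    return sum(1 for url in urls if requires_accept_language(url, fetch))
```

The `fetch` parameter is injectable, so the logic can be tried against a fake fetcher without touching the network.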
@rustybrick Good news! You're in my daily roundup!
Just wanted to reiterate that no one should ever rely on this. We can only ever discover one URL with this method, and it's way, way more likely we'd crawl that one URL if you'd seeded it through Search Console, for example. So just don't.
@hellemans it's a bit more subtle than "news doesn't render". When you need to index fast, you need to have tighter deadlines, and that might mean, for example, that we end up with the raw HTML in serving for a few minutes. It eventually fixes itself, but still.
@JohnMu @jdevalk @facan Just to be clear about the rationale behind that new doc: we got documentation feedback asking for an example of combining extensions. That's the only reason we added that new doc.
Long term, I have no problems with working together to improve the sitemaps protocol, just not now.