Quick update today to fix a bug with blank pages that return HTTP 200 status codes, and to update some outdated dependencies.
To upgrade: https://t.co/U994xM1HJP
Great update, especially, I think, for the language detection with fastText. I'm stealing the idea: I hadn't noticed that the docs included a ready-made model. I was going to a lot of trouble for nothing 😂 Either way, great addition to the crawler, go check it out 👍
v0.3 is finally here! Exclusion patterns, custom XPath extractors, proxies, language detection... See the full list of new features here: https://t.co/fRZenkNiz8
A client of ours brought up this SEO-focused crawl library that I wasn't aware of (@crowltech) https://t.co/MT5BW5R8E6
It abstracts some of the stuff you'll likely need to build if you're scraping.
Sharing because it was new to me.
Crowl v0.2 is out! Config files & CSV export are the main features. Check out the release notes: https://t.co/bx9J0SxhyA
And the new docs: https://t.co/Zzi4JilJLs
Feedback is welcome!