Provide a setting to adjust the speed of crawl per page #117

Open
DavidMacDonald opened this issue Mar 26, 2020 · 1 comment

Comments

@DavidMacDonald

When I crawl a site, I sometimes get flagged as a spam bot. The flag is reported to a central service, which then blocks my IP address on many other sites.

I think we could solve this by slowing down the request rate. I believe the crawl currently takes about 1-2 seconds per page at most. It would be good to be able to adjust this over a wide range.
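
To make the request concrete, here is a minimal sketch of what an adjustable per-page delay could look like. It is not taken from this project's code: `crawl_with_delay`, `delay_seconds`, `jitter_seconds`, and the injected `fetch` and `sleep` callables are all hypothetical names chosen for illustration, and the 2-second default is only an assumption based on the estimate above.

```python
import random
import time
from typing import Callable, Iterable, List


def crawl_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 2.0,   # assumed default, not the tool's real value
    jitter_seconds: float = 0.0,  # optional random extra wait per request
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in order, pausing between consecutive requests."""
    if delay_seconds < 0 or jitter_seconds < 0:
        raise ValueError("delay_seconds and jitter_seconds must be non-negative")

    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds + random.uniform(0.0, jitter_seconds))
        results.append(fetch(url))
    return results
```

With a setting like this, a user who keeps getting flagged could raise `delay_seconds` to 30 or more, while someone crawling their own site could set it to 0. The optional jitter makes the request timing less regular, which may or may not matter to a given bot detector.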

DavidMacDonald commented Mar 26, 2020

Some websites have automated monitors that log the IP addresses of sources that fire many requests in a short time. Large corporations subscribe to services that share this information. When an IP is flagged, every site that subscribes to the service blocks it. I've found myself blocked for weeks from shopping sites and the like.
