Retry Errors in Web Scraping

The retry errors feature in web scraping agents allows failed requests to be retried automatically, increasing the chance of scraping the data successfully. When the retry-errors feature is enabled, Agenty will automatically retry the pages for which the website you are crawling returns a 4xx-5xx status code.
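
Outside of Agenty, the same idea looks roughly like this. The Python sketch below is illustrative only (the URL, retry count, and interval are placeholders, not Agenty's implementation): keep retrying while the server answers with a 4xx-5xx status.

```python
import time
import requests

def fetch_with_retry(url, max_retries=3, interval=5):
    """Retry a request whenever the server answers with a 4xx/5xx status."""
    for attempt in range(1, max_retries + 1):
        response = requests.get(url, timeout=30)
        if response.status_code < 400:   # success, stop retrying
            return response
        print(f"Attempt {attempt}: got HTTP {response.status_code}, retrying...")
        time.sleep(interval)             # wait before the next attempt
    return response                      # give up after max_retries
```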

There may be several reasons a web page returns an error code, such as rate limiting (429), blocked or forbidden requests (403), or temporary server failures (500, 502, 503).

So the fail-retry feature in web scraping agents is designed with all of those errors in mind. When it's in use, Agenty will retry the same request with another IP address, user agent, or geo-proxy, depending on what the agent is configured for.
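
As an illustration of that rotation (not Agenty's internal code), a retry loop might pick a different user agent and proxy on each attempt. The user-agent strings and proxy addresses below are placeholders:

```python
import random
import requests

# Placeholder pools; in practice these come from your proxy provider.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]
PROXIES = ["http://proxy1.example.com:8080", "http://proxy2.example.com:8080"]

def fetch_rotating(url, max_retries=3):
    """Retry with a different user agent and proxy on each attempt."""
    for _ in range(max_retries):
        proxy = random.choice(PROXIES)
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        response = requests.get(url, headers=headers,
                                proxies={"http": proxy, "https": proxy},
                                timeout=30)
        if response.status_code < 400:
            return response
    return response
```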

Options

  1. Edit the agent by clicking on the Edit tab
  2. Scroll down to the Retry Errors section and enable the Failed Request Retry switch
  3. Set the Max retry(n) value to between 1 and 10
  4. Set the Retry with Interval(seconds) value between 0 and 300 seconds if you want to wait a few seconds before retrying the same request
  5. You may also set the Max Time to Spent in Retry(seconds) value to tell Agenty when to stop if the error continues (see the sketch after this list)
  6. Then, Save the scraping agent configuration.
  7. Finally, run your agent.
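
To make these three settings concrete, here is a hedged sketch of how they typically interact in a retry loop. The parameter names mirror the UI labels, but the function itself is illustrative, not Agenty's code:

```python
import time
import requests

def retry_request(url, max_retry=3, retry_interval=10, max_time_in_retry=60):
    """max_retry: 1-10 attempts; retry_interval: 0-300 s pause between
    attempts; max_time_in_retry: stop retrying once this budget is spent."""
    started = time.monotonic()
    for attempt in range(max_retry):
        response = requests.get(url, timeout=30)
        if response.status_code < 400:
            return response
        # Stop early if the total retry time budget is exhausted.
        if time.monotonic() - started > max_time_in_retry:
            break
        time.sleep(retry_interval)
    return response
```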

Agenty doesn’t charge any extra page credits when retry-errors is enabled, and you’ll be charged only for successful requests. So it’s a web scraping best practice to keep this feature enabled in your scraping agent.

Logs

To find out which requests errored and were retried by your scraping agent, check the agent logs. For example, in this screenshot the 502 HTTP error was retried 4 times.

Retry Rules

It’s not just status codes. You can also use the advanced rules option to define custom rules that tell Agenty when a request should be retried automatically.

For example, in this screenshot I added the rule selector-not-match : .price to retry scraping the web page if it doesn’t contain an element matching the selector given in the value.
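
A rule like this can be reproduced in plain Python with an HTML parser. The sketch below retries when the .price selector finds nothing; BeautifulSoup is assumed for parsing, and the retry counts are placeholders:

```python
import time
import requests
from bs4 import BeautifulSoup

def fetch_until_selector_matches(url, selector=".price",
                                 max_retries=3, interval=5):
    """Retry a page until the CSS selector matches at least one element."""
    for _ in range(max_retries):
        response = requests.get(url, timeout=30)
        soup = BeautifulSoup(response.text, "html.parser")
        if soup.select(selector):    # selector matched: page looks complete
            return response
        time.sleep(interval)         # selector-not-match: retry the page
    return response
```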
