How to accept cookie consent prompts before scraping the website

When scraping a website with Agenty’s scraping tool, cookie consent prompts may be displayed. Sometimes we must dismiss or accept these prompts before the web scraping agent can continue to extract data or capture screenshots automatically.

Cookie consent prompts are a standard part of the e-commerce experience, but each one has to be handled individually in the browser, which leads to a lot of tedious clicking. On some websites, especially those serving the European region, accepting is mandatory: until we click the “Accept” button, the site won’t load the products or pages we want to scrape.

There are multiple ways in Agenty to click on a button to accept/reject cookie consent automatically.

  1. Native option to turn on/off
  2. Using the login commands to perform a one-time action
  3. Using JavaScript to click on the button
  4. Using Playwright/Puppeteer code in developer mode

Native option to accept cookie consent

There is a native on/off option in the web scraping and crawling agent that specifies whether Agenty should click the accept cookies button when visiting a website for the first time.

 Agent > configuration > browser settings

When enabled, Agenty looks for any active popup or modal with a button such as ‘accept cookie’ or ‘allow cookies’, clicks it to give consent, and then continues crawling.
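To make the idea of matching a consent button by its label concrete, here is a minimal sketch. It is not Agenty's actual implementation, which is not documented here. The helper name findConsentButton, the phrase list, and the candidate element selector are all assumptions made for illustration.

```javascript
// Hypothetical helper, NOT Agenty's implementation: finds the first clickable
// element whose visible text contains one of the consent phrases.
const CONSENT_PHRASES = ['accept cookie', 'allow cookies'];

function findConsentButton(doc, phrases = CONSENT_PHRASES) {
  // Assumed set of clickable elements; real banners may use other markup.
  const candidates = doc.querySelectorAll('button, a, [role="button"]');
  for (const el of candidates) {
    const text = (el.textContent || '').trim().toLowerCase();
    if (phrases.some((phrase) => text.includes(phrase))) {
      return el;
    }
  }
  return null; // no matching button on this page
}
```

In a browser it would be called as findConsentButton(document), and the returned element, if any, can then be clicked.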

Accept cookies with commands

Cookie consent prompts are usually one-time prompts and won’t appear on subsequent pages in the same session. That makes them a perfect candidate for automation through the login feature, because the login commands run only once, before scraping starts for the URLs given in the input.

The login feature allows us to execute several commands in sequence to perform an interactive action on a website, e.g. log in, select a location or region, enter a zip code, etc.

Follow these steps to simulate a click to accept or reject cookies.

  • Add the navigate command to open the website home page or any other page.
  • Add the click command to click on the accept or reject button.
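The two steps above amount to a short, ordered command sequence. The sketch below shows roughly what that sequence does in Puppeteer or Playwright terms. The command object shape (type, url, selector) and the runCommands function are hypothetical and are not Agenty's configuration format. The URL and the OneTrust selector are placeholders to replace with the target site's values.

```javascript
// Hypothetical command list: field names are illustrative, not Agenty's schema.
const commands = [
  { type: 'navigate', url: 'https://example.com/' },           // placeholder URL
  { type: 'click', selector: '#onetrust-accept-btn-handler' }, // assumed selector
];

// Runs each command once, in order, against a Puppeteer or Playwright `page`.
async function runCommands(page, commandList) {
  for (const command of commandList) {
    if (command.type === 'navigate') {
      await page.goto(command.url);
    } else if (command.type === 'click') {
      await page.click(command.selector);
    } else {
      throw new Error(`Unsupported command type: ${command.type}`);
    }
  }
}
```

Because the sequence runs once per session, the consent cookie set by the click is already in place for the pages scraped afterwards.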

Accept cookies with JavaScript

We can also inject a small JavaScript function through the waitFor option in Agenty to click the accept cookies button, close a modal, etc.

This lets us check and run our script after each page load, as opposed to the login feature, which runs only once.

Here is example code that clicks the accept cookies button if one is found after the page loads:

var cookieBtn = document.querySelector('#onetrust-accept-btn-handler');
if(cookieBtn){
  cookieBtn.click();
}

Note that the if statement checks whether the cookie button is present before clicking it. When the button is missing, querySelector returns null, and calling click() on null would throw a TypeError.
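Different websites use different consent banners, so a single selector will not match everywhere. As a minimal sketch, the same null check can be extended to try a list of selectors in order. The function name clickFirstMatch is hypothetical, and every selector other than the OneTrust one is a made-up placeholder to replace with the selectors of the sites being scraped.

```javascript
// Hypothetical helper: clicks the first element matched by any of the selectors
// and returns the selector that matched, or null if none was found.
function clickFirstMatch(doc, selectors) {
  for (const selector of selectors) {
    const el = doc.querySelector(selector);
    if (el) {
      el.click();
      return selector;
    }
  }
  return null;
}

// Placeholder selectors, apart from the OneTrust one used in the example above.
const consentSelectors = [
  '#onetrust-accept-btn-handler',
  '#accept-cookies',
  '.cookie-banner .accept',
];
```

In the browser it would be called as clickFirstMatch(document, consentSelectors).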

Accept cookies with Puppeteer and Playwright

If you are using Agenty’s developer mode, you can add this code inside the page.evaluate function, after page.goto, to accept or dismiss the cookie consent.

// Go to the `url` from the input request, accept cookies, extract the title and return the results

export default async ({ page, request }) => {
    const response = await page.goto(request.url);
    console.log(response.status());

    // Accept cookies: click the consent button only if it is present
    await page.evaluate(() => {
        const cookieBtn = document.querySelector('#onetrust-accept-btn-handler');
        if (cookieBtn) {
            cookieBtn.click();
        }
    });

    // Read the title after the consent prompt has been handled
    const pageTitle = await page.title();

    return {
        data: { title: pageTitle },
        type: 'application/json'
    };
};
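The agent code above clicks the button immediately after page.goto, so a banner that is rendered a moment later could be missed. One possible refinement, sketched here under assumptions, is to wait briefly for the button before clicking. The helper name acceptCookies and the 5000 ms default are hypothetical. page.waitForSelector and page.click are available in both Puppeteer and Playwright.

```javascript
// Hypothetical helper: waits up to `timeoutMs` for the consent button, clicks it
// if it shows up, and reports whether a click happened.
async function acceptCookies(page, selector, timeoutMs = 5000) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Any wait failure (typically a timeout) is treated as "no banner".
    return false;
  }
  await page.click(selector);
  return true;
}
```

Inside the agent function it could be called right after page.goto, for example: await acceptCookies(page, '#onetrust-accept-btn-handler');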
