How to accept cookie consent prompts before scraping the website

When scraping a website with Agenty’s scraping tool, cookie consent prompts are often displayed. These prompts must be dismissed or accepted before the web scraping agent can continue to extract data or capture a screenshot.

Cookie consent prompts are a standard part of the e-commerce experience, but each one has to be handled individually in the browser, leading to a lot of tedious clicking. On some websites, especially those serving the European region, accepting is mandatory: we must click the “Accept” button to continue, or the site won’t load the products or pages we want to scrape.

Accept cookies with commands

Cookie consent prompts usually appear only once and won’t show up on subsequent pages in the same session. That makes them a perfect candidate to automate through the login feature, because the login commands run only once, before scraping starts for the given input URLs.

The login feature allows us to execute several commands in sequence to perform an interactive action on a website, e.g. log in, select a location or region, enter a zip code, etc.

Follow these steps to simulate a click to accept or reject cookies.

  • Add the navigate command to open the website home page or any other page.
  • Add the click command to click the accept or reject button.

Accept cookies with JavaScript

We can also inject a small JavaScript snippet through the waitFor option in Agenty to click the accept cookies button, close a modal, etc.

This lets us check for the prompt and run our script after each page load, as opposed to the login feature, which runs only once.

Here is example code that clicks the accept cookies button if one is found after the page loads:

var cookieBtn = document.querySelector('#onetrust-accept-btn-handler');
if(cookieBtn){
  cookieBtn.click();
}

Note that the if statement checks whether the cookie button is present before clicking it. When no element matches, querySelector returns null, and calling click() on null would throw a TypeError.
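Consent banners differ between sites, so the same check can be extended to try several selectors in turn. The snippet below is only a sketch: the function name clickFirstConsentButton and the second selector are hypothetical, and only #onetrust-accept-btn-handler comes from the example above. It also assumes the injected script can contain ordinary JavaScript statements.

```javascript
// Candidate selectors for the accept button.
// Only the OneTrust id is taken from the article; the second entry is an
// assumed placeholder and should be replaced with the target site's selector.
const CONSENT_SELECTORS = [
  '#onetrust-accept-btn-handler',
  'button[aria-label="Accept cookies"]',
];

// Clicks the first element that matches any selector in the list.
// Returns the selector that matched, or null when no button was found.
function clickFirstConsentButton(selectors, root) {
  for (const selector of selectors) {
    const button = root.querySelector(selector);
    if (button) {
      button.click();
      return selector;
    }
  }
  return null;
}

// Run against the live page only when a DOM is available.
if (typeof document !== 'undefined') {
  clickFirstConsentButton(CONSENT_SELECTORS, document);
}
```

Returning the matched selector makes it easier to see which banner variant a page actually used while debugging.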

Accept cookies with Puppeteer and Playwright

If you are using Agenty’s developer mode, you can add this code in the page.evaluate function, called after page.goto, to accept or dismiss the cookie consent.

// Go to the `url` from input request, extract title and return the results
 
export default async ({ page, request }) => {
    const response = await page.goto(request.url, {waitUntil: 'networkidle2'});            
    console.log(response.status());
    
    const pageTitle = await page.title();
 
    // Accept cookies
    await page.evaluate(() => {
        var cookieBtn = document.querySelector('#onetrust-accept-btn-handler');
        if(cookieBtn){
            cookieBtn.click();
        }
    });
    
    return {
        data : { title : pageTitle },
        type : 'application/json'
    };   
};
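Some consent banners are rendered a moment after the page load event, in which case a single querySelector call may run too early. One possible variation, sketched below, waits a few seconds for the button before clicking it. The helper name acceptCookies, its default selector and the 5000 ms timeout are assumptions for illustration; page.waitForSelector and page.click are available on both Puppeteer and Playwright page objects.

```javascript
// Hypothetical helper: waits briefly for a consent button, then clicks it.
// `page` is assumed to be a Puppeteer or Playwright page object.
async function acceptCookies(
  page,
  selector = '#onetrust-accept-btn-handler',
  timeoutMs = 5000 // assumed value, tune for the target site
) {
  try {
    // Resolves once the element is found, rejects when the timeout expires.
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Any failure while waiting is treated here as "no banner on this page".
    return false;
  }
  await page.click(selector);
  return true;
}
```

In the script above, this could replace the page.evaluate call with `await acceptCookies(page);` placed right after page.goto. Also note that 'networkidle2' is a Puppeteer value for waitUntil; Playwright uses 'networkidle' instead.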
