What is Playwright and how do you use it for web scraping and testing?

Playwright is a Node.js library developed by Microsoft for automating web browser interactions. It allows users to control browsers to perform tasks like capturing screenshots, generating PDFs, crawling Single-Page Applications (SPAs), and automating form submissions. Playwright offers a high-level API for both headless (UI-less) and non-headless browsers.

With support for Chromium, Firefox, and WebKit through a single API, Playwright is well suited to cross-browser testing: the same script can check that a site behaves consistently across different browser engines.

Playwright is mainly used for web scraping and automated website testing:

Web scraping

To use Playwright for web scraping, navigate to a page and extract data using CSS selectors. The example below does the following:

  1. Launch the browser.
  2. Open the website URL.
  3. Extract the text of the first h4 element using the page.textContent function.

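// The browser classes are exported by the "playwright" package.
const { chromium } = require("playwright");

(async () => {
  // Launch a headless Chromium instance and open a new tab.
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto("https://scrapingsandbox.com/products");
  // textContent returns the text of the first element matching the selector.
  const elementText = await page.textContent("h4");
  console.log("Extracted Text:", elementText);
  await browser.close();
})();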

To run the script, save it as extract.spec.js and run the command node extract.spec.js in a terminal, for example the VS Code integrated terminal.
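
Since page.textContent only returns the first match, collecting every matching element needs a locator instead. The following is a minimal sketch of that idea, not a definitive implementation. The file name extract-all.js, the --url flag, and the helper names cleanTexts and scrapeAll are all assumptions made for illustration, and the h4 selector is simply carried over from the example above.

```javascript
// extract-all.js (hypothetical file name)

// Pure helper: trim whitespace and drop empty strings from scraped text.
function cleanTexts(texts) {
  return texts.map((t) => t.trim()).filter((t) => t.length > 0);
}

// Collect the text of every element matching `selector` on `url`.
async function scrapeAll(url, selector) {
  // Loaded lazily so cleanTexts can be reused without a browser installed.
  const { chromium } = require("playwright");
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // allTextContents() resolves to one string per matching element.
    const texts = await page.locator(selector).allTextContents();
    return cleanTexts(texts);
  } finally {
    // Close the browser even if navigation or extraction fails.
    await browser.close();
  }
}

// Run only when a URL is supplied via the (hypothetical) --url flag.
const flagIndex = process.argv.indexOf("--url");
if (flagIndex !== -1 && process.argv[flagIndex + 1]) {
  scrapeAll(process.argv[flagIndex + 1], "h4")
    .then((titles) => console.log("Extracted:", titles))
    .catch((err) => {
      console.error(err);
      process.exitCode = 1;
    });
}
```

Under these assumptions, the script would be run as node extract-all.js --url https://scrapingsandbox.com/products.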

Website testing

Here are some main features of Playwright:

  1. Browser support: Playwright drives Chromium, Firefox, and WebKit, the engines behind Chrome, Edge, Firefox, and Safari, so you can test your websites across all major browsers.

  2. Headless mode: Playwright can operate in both headless (UI-less) and headed (visible) modes, giving you the flexibility to observe the automation process or run it in the background.

  3. Network interception: Playwright can intercept network activity, allowing you to mock or verify responses to API calls and fetch requests as needed.

  4. Mobile device emulation: Playwright can emulate mobile devices, making it easy to test responsive designs and mobile-specific functionality for iOS and Android devices.

  5. Multiple pages: With Playwright, you can open multiple browser pages at the same time and control each one independently, which can speed up scraping.
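
Network interception, device emulation, and multiple pages can be combined in one script. The sketch below is one possible arrangement under stated assumptions: the names BLOCKED_TYPES, shouldBlock, and scrapeTitles, the --urls flag, the choice of the "iPhone 13" device descriptor, and the decision to block images, fonts, and media are all illustrative and not taken from the original text.

```javascript
// Resource types to skip while scraping (an illustrative choice).
const BLOCKED_TYPES = new Set(["image", "font", "media"]);

// Pure helper: decide whether a request of this type should be aborted.
function shouldBlock(resourceType) {
  return BLOCKED_TYPES.has(resourceType);
}

// Load each URL in its own page and return its document title.
async function scrapeTitles(urls) {
  // Loaded lazily so shouldBlock can be reused without a browser installed.
  const { chromium, devices } = require("playwright");
  const browser = await chromium.launch();
  try {
    // Mobile emulation: apply a built-in device descriptor
    // (viewport, user agent, touch support) to the context.
    const context = await browser.newContext({ ...devices["iPhone 13"] });

    // Network interception: abort unwanted requests, let the rest through.
    await context.route("**/*", (route) =>
      shouldBlock(route.request().resourceType())
        ? route.abort()
        : route.continue()
    );

    // Multiple pages: one page per URL, all loading concurrently.
    return await Promise.all(
      urls.map(async (url) => {
        const page = await context.newPage();
        await page.goto(url);
        const title = await page.title();
        await page.close();
        return { url, title };
      })
    );
  } finally {
    await browser.close();
  }
}

// Run only when URLs are supplied after the (hypothetical) --urls flag.
const urlsFlag = process.argv.indexOf("--urls");
if (urlsFlag !== -1) {
  scrapeTitles(process.argv.slice(urlsFlag + 1))
    .then((results) => console.log(results))
    .catch((err) => {
      console.error(err);
      process.exitCode = 1;
    });
}
```

Skipping heavy resources is a common way to reduce load time when only text is needed, but it can change how some pages render, so the blocked types are worth adjusting per site.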

Screenshot example using Playwright:

// chromium is a named export of @playwright/test; there is no test.chromium.
import { test, chromium } from '@playwright/test';

test('page screenshot', async () => {
  // Launch a visible (headed) Chromium window.
  const browser = await chromium.launch({ headless: false });
  const context = await browser.newContext({
    viewport: { width: 1280, height: 720 },
  });
  const page = await context.newPage();
  await page.goto("https://scrapingsandbox.com/products");
  // fullPage captures the whole scrollable page, not just the viewport.
  await page.screenshot({ path: "products.png", fullPage: true });
  await browser.close();
});

To run the test, save it as screenshot.spec.js and run the command npx playwright test screenshot.spec.js in a terminal, for example the VS Code integrated terminal.

To see the report, run the command npx playwright show-report, which opens the HTML report in the browser.
