Join Two or More Extracted Rows into a Single Cell

The Join option in a scraping agent lets you combine multiple extracted values into one cell. It is especially helpful when a field's selector returns multiple matches and you want to combine them into a single delimited string.

For example, a product page may display multiple categories, sizes, images, or color variants. By default, the scraping agent outputs each match in a separate row. The Join option combines all matches for a field into a single cell, so the data table keeps a one-product, one-row format.

Example

On the product page below, the breadcrumb shows the category as Home > Books > Poetry, followed by the book name. The .breadcrumb a selector extracted 3 matches in separate rows, while product_name and price appear only on the 1st row.

http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html

Before

Steps

  1. Edit the scraping agent by clicking on the Edit tab.
  2. Go to the field you want to join (in this case, Category) and enable the Join switch.
  3. Save the scraping agent configuration.
  4. Re-run the scraping agent to apply the changes.

After

After the scraping agent runs again, the field's results are joined into a single cell, as in the Category column.
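The before-and-after transformation can be sketched in a few lines of Python. This is only an illustration of the idea, not Agenty's actual implementation: the join_field helper, the row layout, and the sample price are assumptions made for the example.

```python
# Illustrative sketch only: join_field is a hypothetical helper,
# not part of Agenty. It mimics what the Join switch does to one field.

def join_field(rows, field, delimiter=","):
    """Collapse all non-empty matches of `field` into one cell on the first row."""
    values = [row[field] for row in rows if row.get(field)]
    joined = dict(rows[0])          # keep the other fields from the 1st row
    joined[field] = delimiter.join(values)
    return joined

# "Before": three rows, one per breadcrumb match (sample values are illustrative).
before = [
    {"product_name": "A Light in the Attic", "price": "51.77", "category": "Home"},
    {"product_name": "", "price": "", "category": "Books"},
    {"product_name": "", "price": "", "category": "Poetry"},
]

# "After": one row, with all category matches in a single cell.
after = join_field(before, "category")
print(after["category"])
```

The other fields are taken from the first row because, in the example above, that is the only row where product_name and price are filled in.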

Custom Delimiter

The default join delimiter is a comma (,). You can also set a custom delimiter with the JoinDelimiter post-processing function, which tells Agenty which delimiter to use when combining the values.

For example, to use a semicolon (;) as the delimiter, add a post-processing function to this field and provide the custom delimiter, as in this screenshot.

Re-running the web scraper then produces results joined with the custom delimiter.
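To make the effect of the delimiter concrete, here is a minimal Python sketch comparing the default comma with a custom semicolon. The join_values function is a hypothetical stand-in for illustration and does not represent how the JoinDelimiter function is configured or implemented in Agenty.

```python
# Illustrative sketch only: join_values is a hypothetical stand-in,
# not the JoinDelimiter post-processing function itself.

def join_values(values, delimiter=","):
    """Join the non-empty, whitespace-trimmed values with the given delimiter."""
    cleaned = [v.strip() for v in values if v.strip()]
    return delimiter.join(cleaned)

categories = ["Home", "Books", "Poetry"]

default_joined = join_values(categories)                 # default comma
custom_joined = join_values(categories, delimiter=";")   # custom semicolon

print(default_joined)
print(custom_joined)
```

A custom delimiter is mainly useful when the extracted values may themselves contain commas, which would make a comma-joined cell ambiguous to split later.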
