Input Type determines where an agent gets its URLs from, and it can be used to connect agents through a URL. There are 4 input types in Agenty:
- Source URL Only
- Manual URLs
- Select a URL List
- URL From Source Agent
Source URL Only
When we create an agent from a URL, that URL is known as the source URL for that agent. The source URL is mandatory: we can edit it, but we cannot remove it. Here we have created an agent with 3 fields (url, title, price), as shown in the screenshot below.
Now we can set the agent to use only its source URL as input.
Steps
- Go to your Scraping Agent page
- Click on the Input tab
- Now select the Input Type as “Source URL Only”
- Save the input configuration
- Now, re-run the agent to execute the job for the selected source URL.
Manual URLs
Manual URLs is used to extract data in bulk from different pages that share the same structure. For example, I have these two URLs:
- https://cdn.agenty.com/sample_content/list/simple-list.html
- https://cdn.agenty.com/sample_content/list/list-2.html
As you can see, both URLs have the same structure. So I created the agent from the first URL, https://cdn.agenty.com/sample_content/list/simple-list.html, with 5 fields (URL, Name, Brand, Color, Price), as shown in the screenshot below.
Before Manual URLs
Now I add all the URLs manually to my scraping agent (Manual URLs Example) to get the same fields from each page.
Steps
- Go to your Scraping Agent page
- Click on the Input tab
- Now select the Input Type as “Manual URLs”
- Put the other URL in the URLs List
- Save the input configuration
- And, re-run the agent to execute the job for the selected “Manual URLs”.
After Manual URLs
Now, if you look at the updated result, it also contains the values from the other URL.
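Before pasting a batch of links into the URLs List box, it can help to tidy them up first. The snippet below is a minimal sketch of that preparation step, not part of Agenty itself: the helper name `clean_url_list` is hypothetical, and it assumes the box takes one URL per line. It only uses the Python standard library and the two sample URLs from this page.

```python
from typing import List
from urllib.parse import urlparse


def clean_url_list(raw_text: str) -> List[str]:
    """Return unique, absolute http(s) URLs, one per input line, in original order."""
    seen = set()
    cleaned = []
    for line in raw_text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip anything that is not an absolute http(s) URL
        if url in seen:
            continue  # skip duplicates
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Sample input: the two URLs from this page, one duplicate, and one invalid line.
raw = """
https://cdn.agenty.com/sample_content/list/simple-list.html
https://cdn.agenty.com/sample_content/list/list-2.html
https://cdn.agenty.com/sample_content/list/simple-list.html
not-a-url
"""

urls = clean_url_list(raw)
print("\n".join(urls))  # text ready to paste into the URLs List box
```

The cleaned text can then be pasted into the URLs List box in the steps above.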
Select a URL List
The Select a URL List input type allows us to create and manage a large number of input URLs for an agent. We can’t enter a lot of URLs in the manual input text area on the agent page, because the size of the in-memory text might freeze the browser. This feature is especially helpful when we are scraping a big website whose pages share the same structure and we have a list of more than 5000 URLs. For example, we have this scraping agent (“Select a URL List Example”) with 5 fields (URL, title, price, availability, description).
Before
Now we want the agent to run against many more URLs, so we use the input type Select a URL List.
Steps
- Click on the Input tab and select Input type as “Select a URL List”
- Click on the Create new list button to create a list; the list page appears
- Enter the list Name and then choose the delimited file to upload
- Select the “Delimiter” as per your file. For example, Comma (,) for CSV
- Tick the Has headers? check box if your file has headers, or un-check it if it has none and Agenty will auto-generate the headings with names like Field1, Field2…
- Before uploading the file, click on the Upload Preview button to ensure that Agenty is reading the file correctly with the settings you have applied
- If the data is populated correctly in the table preview, click on the Confirm upload button to finally upload the file
- Now come back to the Input tab page and select the list which you want to use as input
- Finally, select the field which contains the URL in your list
- Save the input configuration
- And re-run the agent to see the updated result.
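To check a delimited file locally before uploading it, the behaviour of the Delimiter and Has headers? settings can be approximated with Python's standard `csv` module. This is only an illustrative sketch of what the steps above describe, not Agenty's own implementation: the `preview` function, its parameters, and the `example.com` URLs are all hypothetical.

```python
import csv
import io


def preview(text, delimiter=",", has_headers=True, limit=5):
    """Parse delimited text and return (headers, first `limit` data rows)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return [], []
    if has_headers:
        headers, data = rows[0], rows[1:]
    else:
        # No header row: generate names like Field1, Field2, ...
        headers = ["Field%d" % (i + 1) for i in range(len(rows[0]))]
        data = rows
    return headers, data[:limit]


# Hypothetical file content with a header row.
with_headers = "url,category\nhttps://example.com/p/1,books\nhttps://example.com/p/2,music\n"
headers, rows = preview(with_headers, delimiter=",", has_headers=True)

# The same data without a header row.
without_headers = "https://example.com/p/1,books\nhttps://example.com/p/2,music\n"
auto_headers, auto_rows = preview(without_headers, delimiter=",", has_headers=False)

# Pick the column that contains the URL, as in the final selection step.
url_index = headers.index("url")
url_column = [row[url_index] for row in rows]

print(headers, auto_headers, url_column)
```

If the wrong delimiter is chosen, each row collapses into a single column, which is the kind of problem the preview step is meant to catch before the upload is confirmed.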
URL From Source Agent
The URL From Source Agent input type can be used to connect a List agent and a Details agent. The List scraping agent is the source agent, and the Details scraping agent extracts data from each individual page using the URLs collected by the List scraping agent. It is also used for extracting data in bulk from the different pages those links point to. For example, I have this source URL, https://books.toscrape.com/catalogue, which displays the content. If you look at the content, you will find different “Page URL” values, each with a corresponding “Website URL”. Now we create a scraping agent for both fields.
Steps
- Create the List agent with 2 fields, Page_URL and Website_URL.
- Create the Details agent with 4 fields (Title, URL, Price, Availability).
- Now go to the Input tab in the Details agent
- Select Input type as “URL from Source Agent”
- Select the List agent in the “select the Agent” drop-down list
- Select “Collection1.Page_URL” in the “select the Field contains URL” drop-down list
- Save the input changes
- And, re-run the agent to see the updated result.
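Conceptually, the List agent's output becomes the Details agent's input. The sketch below models that hand-off in plain Python to make the data flow concrete. It is an assumption-laden illustration, not Agenty's API: the function names, the sample rows, and the `fake_extract` stand-in are all hypothetical, and no pages are actually fetched.

```python
# Hypothetical rows, shaped like the output of the List agent (Page_URL, Website_URL).
list_agent_rows = [
    {"Page_URL": "https://books.toscrape.com/catalogue/page-1.html",
     "Website_URL": "https://books.toscrape.com"},
    {"Page_URL": "https://books.toscrape.com/catalogue/page-2.html",
     "Website_URL": "https://books.toscrape.com"},
]


def urls_from_source(rows, field):
    """Collect the non-empty values of `field` from the source agent's rows."""
    return [row[field] for row in rows if row.get(field)]


def run_details_agent(urls, extract):
    """Call `extract` once per URL and collect one result row per page."""
    return [extract(url) for url in urls]


def fake_extract(url):
    # Stand-in for real page extraction: returns the four Details fields
    # with placeholder values, since this sketch does no network access.
    return {"Title": None, "URL": url, "Price": None, "Availability": None}


# Selecting the Page_URL field corresponds to choosing "Collection1.Page_URL" above.
input_urls = urls_from_source(list_agent_rows, "Page_URL")
results = run_details_agent(input_urls, fake_extract)
print(len(results), "detail rows")
```

Each URL found in the chosen field of the source agent produces one row in the Details agent's result, which is why selecting the right field matters.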








