Jobs API

The Agenty Jobs API lets you start new background jobs for a given agent_id, check job status, fetch job results and logs, and download results in CSV format.

Start a job

Starts a new asynchronous job for the agent_id given in the request body.

Endpoint:

Method: POST
URL: https://api.agenty.com/v2/jobs/start

Headers:

Key | Value | Description
Content-Type | application/json | -

Query params:

Key | Value | Description
apikey | {{API_KEY}} | Your API key

Body:

{"agent_id":"{{AGENT_ID}}"}

Responses:

Status: OK | Code: 200

{
  "job_id": 3689994,
  "account_id": 59703,
  "agent_id": "45l0wewqzk",
  "type": "scraping",
  "status": "submitted",
  "priority": 5,
  "pages_total": 0,
  "pages_processed": 0,
  "pages_succeeded": 0,
  "pages_failed": 0,
  "pages_credit": 0,
  "created_at": "2022-01-10T11:32:18.2362494Z",
  "started_at": null,
  "completed_at": null,
  "stopped_at": null,
  "is_scheduled": false,
  "queue_time": null,
  "run_duration": null,
  "error": null,
  "ping_at": null,
  "assigned_worker_id": null,
  "running_worker_id": 1
}
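
A minimal Python sketch of this call, using only the standard library. The helper names build_start_request and start_job are hypothetical, and the API key and agent id are placeholders you supply; the endpoint, header, query parameter, and body follow the description above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.agenty.com/v2"


def build_start_request(api_key: str, agent_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a job."""
    query = urllib.parse.urlencode({"apikey": api_key})
    body = json.dumps({"agent_id": agent_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/jobs/start?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_job(api_key: str, agent_id: str) -> dict:
    """Send the request and return the decoded job object."""
    with urllib.request.urlopen(build_start_request(api_key, agent_id)) as resp:
        return json.load(resp)
```

Calling start_job("YOUR_API_KEY", "YOUR_AGENT_ID") should return a job object like the one above; keep its job_id for the other calls in this section.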

Stop a running job

Sends a stop request to the Agenty workers running the given job id in the background.

Endpoint:

Method: GET
URL: https://api.agenty.com/v2/jobs/{{JOB_ID}}/stop

Headers:

Key | Value | Description
Content-Type | application/json | -

Query params:

Key | Value | Description
apikey | {{API_KEY}} | Your API key

Responses:

Status: OK | Code: 200

{
  "job_id": 3687911,
  "account_id": 59703,
  "agent_id": "w9qmlp5475",
  "type": "scraping",
  "status": "stopped",
  "priority": 5,
  "pages_total": 0,
  "pages_processed": 0,
  "pages_succeeded": 0,
  "pages_failed": 0,
  "pages_credit": 0,
  "created_at": "2022-01-10T09:47:48Z",
  "started_at": null,
  "completed_at": "2022-01-10T09:49:16.8233821Z",
  "stopped_at": "2022-01-10T09:49:16.8233806Z",
  "is_scheduled": false,
  "queue_time": null,
  "run_duration": null,
  "error": null,
  "ping_at": null,
  "assigned_worker_id": null,
  "running_worker_id": null
}
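
A short Python sketch of the stop call, assuming the endpoint and apikey parameter described above. The helper names stop_url, stop_job, and is_stopped are hypothetical, and checking status and stopped_at is only one plausible way to confirm the stop, based on the example response.

```python
import json
import urllib.parse
import urllib.request


def stop_url(job_id: int, api_key: str) -> str:
    """Return the URL that requests a stop for the given job."""
    query = urllib.parse.urlencode({"apikey": api_key})
    return f"https://api.agenty.com/v2/jobs/{job_id}/stop?{query}"


def stop_job(job_id: int, api_key: str) -> dict:
    """Send the stop request and return the decoded job object."""
    with urllib.request.urlopen(stop_url(job_id, api_key)) as resp:
        return json.load(resp)


def is_stopped(job: dict) -> bool:
    """True when the job object reports a stop, as in the example response."""
    return job.get("status") == "stopped" and job.get("stopped_at") is not None
```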

Get job status by job id

Get the job status and other properties associated with the job, e.g. pages_credit, pages_processed.

Endpoint:

Method: GET
URL: https://api.agenty.com/v2/jobs/{{JOB_ID}}

Query params:

Key | Value | Description
apikey | {{API_KEY}} | Your API key

Responses:

Status: OK | Code: 200

{
  "job_id": 3689856,
  "account_id": 59703,
  "agent_id": "45l0wewqzk",
  "type": "scraping",
  "status": "completed",
  "priority": 5,
  "pages_total": 1,
  "pages_processed": 1,
  "pages_succeeded": 1,
  "pages_failed": 0,
  "pages_credit": 1,
  "created_at": "2022-01-10T11:10:55Z",
  "started_at": "2022-01-10T11:10:56Z",
  "completed_at": "2022-01-10T11:10:58Z",
  "stopped_at": null,
  "is_scheduled": false,
  "queue_time": null,
  "run_duration": null,
  "error": null,
  "ping_at": null,
  "assigned_worker_id": null,
  "running_worker_id": 3
}
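
Because jobs run asynchronously, a client usually polls this endpoint until the job finishes. The sketch below shows one way to do that in Python. The function wait_for_job and its parameters are hypothetical, and the set of terminal statuses is an assumption drawn from the examples in this document ("completed" and "stopped"); other terminal values may exist.

```python
import time
from typing import Callable

# Assumption: a job in one of these statuses will not change any more.
# Only "completed" and "stopped" appear in the examples of this document.
TERMINAL_STATUSES = frozenset({"completed", "stopped"})


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job repeatedly until the job reaches a terminal status.

    fetch_job is any callable returning the decoded job object, for
    example a function that sends GET /v2/jobs/{job_id}.
    """
    for attempt in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt + 1 < max_polls:
            sleep(interval)  # wait before asking again
    raise TimeoutError(f"job still not finished after {max_polls} polls")
```

Passing the fetch step in as a callable keeps the polling logic independent of the HTTP client in use.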

Get job result by job id

Fetches the result of the job with the given job id.

Endpoint:

Method: GET
URL: https://api.agenty.com/v2/jobs/{{JOB_ID}}/result

Query params:

Key | Value | Description
apikey | {{API_KEY}} | Your API key
offset | 0 | Number of rows to skip, used to show the next page. Must be an integer. Use this to paginate when there are more than 2500 rows
limit | 2500 | Maximum number of rows to return per page. Must be an integer between 1 and 2500
collection | 1 | The collection number you want to fetch. Default is 1
modified | 1 | Whether to fetch the modified result when a post-processing script is used. Default is 1, which fetches the modified version when available and the default result otherwise. Use 0 to force Agenty to fetch the default result only

Responses:

Status: OK | Code: 200

{
    "total": 2,
    "limit": 1000,
    "offset": 0,
    "returned": 2,
    "result": [
        {
            "name": "A Light in the ...",
            "price": "£51.77",
            "image": "http://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg",
            "details_page_url": "http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"
        },
        {
            "name": "Tipping the Velvet",
            "price": "£53.74",
            "image": "http://books.toscrape.com/media/cache/26/0c/260c6ae16bce31c8f8c95daddd9f4a1c.jpg",
            "details_page_url": "http://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html"
        }
    ]
}
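
The offset and limit parameters can be combined into a simple pagination loop. The Python sketch below is one possible approach: iter_result_rows and fetch_page are hypothetical names, and the loop assumes each page carries the total and result fields shown in the example response.

```python
from typing import Callable, Iterator


def iter_result_rows(
    fetch_page: Callable[[int, int], dict],
    limit: int = 2500,
) -> Iterator[dict]:
    """Yield every result row, requesting one page at a time.

    fetch_page(offset, limit) is any callable returning the decoded
    response of GET /v2/jobs/{job_id}/result for that offset and limit.
    """
    if not 1 <= limit <= 2500:
        raise ValueError("limit must be between 1 and 2500")
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        rows = page.get("result", [])
        yield from rows
        offset += len(rows)
        # Stop on an empty page or once every reported row has been seen.
        if not rows or offset >= page.get("total", 0):
            break
```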

Get job logs by job id

Fetches the logs of the job with the given job id.

Endpoint:

Method: GET
URL: https://api.agenty.com/v2/jobs/{{JOB_ID}}/logs

Query params:

Key | Value | Description
apikey | {{API_KEY}} | Your API key
offset | 0 | Number of lines to skip, used to show the next page. Must be an integer. Use this to paginate when there are more than 2500 rows
limit | 2500 | Maximum number of rows to return per page. Must be an integer between 1 and 2500

Responses:

Status: OK | Code: 200

2022-01-10T07:46:08.145Z	INFO	Worker id: 2
2022-01-10T07:46:08.169Z	INFO	Job: {"job_id":3683463,"agent_id":"459ypoj1gk","account_id":59703,"type":"scraping","status":"started","priority":16,"created_at":"2022-01-10T07:45:57Z","pages_total":0,"pages_processed":0,"pages_succeeded":0,"pages_failed":0,"pages_credit":0,"is_scheduled":0,"assigned_worker_id":null,"running_worker_id":2,"running_server_ip":null,"error":null,"attempts":null}
2022-01-10T07:46:08.250Z	INFO	Input type: url
2022-01-10T07:46:08.268Z	INFO	Job id: 3683463, Type: scraping, Status: running
2022-01-10T07:46:08.268Z	INFO	Total inputs: 1
2022-01-10T07:46:08.367Z	INFO	Plan: Free
2022-01-10T07:46:08.368Z	INFO	Proxy type: Default
2022-01-10T07:46:08.395Z	INFO	Running page 1 of 1
2022-01-10T07:46:08.395Z	INFO	https://raw.githubusercontent.com/Agenty/Agenty.TestData/master/scraping/csv/top-usa-retailers-2011.csv
2022-01-10T07:46:10.221Z	INFO	Status: 200
2022-01-10T07:46:10.230Z	INFO	REGEX: (\d+)\,(.*?)\,(\d*),(\d+.\d+) extracted 100 match(s) for field Rank
2022-01-10T07:46:10.230Z	INFO	REGEX: (\d+)\,(.*?)\,(\d*),(\d+.\d+) extracted 100 match(s) for field Retailer Name
2022-01-10T07:46:10.230Z	INFO	REGEX: (\d+)\,(.*?)\,(\d*),(\d+.\d+) extracted 100 match(s) for field # Stores
2022-01-10T07:46:10.230Z	INFO	REGEX: (\d+)\,(.*?)\,(\d*),(\d+.\d+) extracted 100 match(s) for field Revenue
2022-01-10T07:46:10.246Z	INFO	Job 3683463 completed successfully
2022-01-10T07:46:17.758Z	INFO	Preparing files for backup...
2022-01-10T07:46:17.812Z	INFO	Gzip files...
2022-01-10T07:46:17.814Z	INFO	4 files gziped successfully
2022-01-10T07:46:17.833Z	INFO	Uploading 4 files to S3...
2022-01-10T07:46:17.967Z	INFO	collection1.csv.gz (Bytes: 1881) uploaded successfully
2022-01-10T07:46:17.967Z	INFO	collection1.json.gz (Bytes: 2050) uploaded successfully
2022-01-10T07:46:17.967Z	INFO	collection1.tsv.gz (Bytes: 1881) uploaded successfully
2022-01-10T07:46:17.967Z	INFO	input.txt.gz (Bytes: 117) uploaded successfully
2022-01-10T07:46:17.967Z	INFO	Backup completed successfully
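
The log response is plain text with one entry per line. The Python sketch below splits a line into its parts, assuming the layout seen in the sample above: a UTC timestamp with fractional seconds, a level, and a message, separated by tab characters. LogEntry and parse_log_line are hypothetical names, and real logs may contain lines that do not follow this layout.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: datetime
    level: str
    message: str


def parse_log_line(line: str) -> LogEntry:
    """Split one tab-separated log line into timestamp, level, and message."""
    # Split at most twice so tabs inside the message are preserved.
    stamp, level, message = line.rstrip("\n").split("\t", 2)
    timestamp = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return LogEntry(timestamp.replace(tzinfo=timezone.utc), level, message)
```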

Export job result by job id

Creates a link for downloading the job result or logs in CSV format.

Endpoint:

Method: GET
URL: https://api.agenty.com/v2/jobs/{{JOB_ID}}/export

Query params:

Key | Value | Description
apikey | {{API_KEY}} | Your API key
type | result | The type of file to export. Must be result or logs
collection | 1 | The collection number you want to export. Must be 1 or greater. Default is 1
modified | 1 | Whether to export the modified result when a post-processing script is used. Default is 1, which exports the modified version when available. Use 0 to download the default result
filename | output | Custom name for the downloaded file. Default is export.csv

Responses:

Status: OK | Code: 200

{
    "downloadlink": "https://server1.agenty.com/Job_12995/output1.csv?signature=sdlfjasoywerxvjsaldfkjpwqeroiiu9123e7"
}
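
Exporting takes two steps: request the download link, then fetch the file behind it. A minimal Python sketch follows, assuming the query parameters listed above and the downloadlink field from the example response. The helper names export_url and download_export are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def export_url(job_id: int, api_key: str, export_type: str = "result",
               collection: int = 1, modified: int = 1,
               filename: Optional[str] = None) -> str:
    """Build the export URL from the documented query parameters."""
    if export_type not in ("result", "logs"):
        raise ValueError("type must be 'result' or 'logs'")
    params = {"apikey": api_key, "type": export_type,
              "collection": collection, "modified": modified}
    if filename:
        params["filename"] = filename
    query = urllib.parse.urlencode(params)
    return f"https://api.agenty.com/v2/jobs/{job_id}/export?{query}"


def download_export(job_id: int, api_key: str, dest_path: str, **options) -> str:
    """Request the download link, then save the file to dest_path."""
    with urllib.request.urlopen(export_url(job_id, api_key, **options)) as resp:
        link = json.load(resp)["downloadlink"]
    urllib.request.urlretrieve(link, dest_path)
    return dest_path
```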

Get all jobs

Get all the jobs for all agents under an account.

Endpoint:

Method: GET
URL: https://api.agenty.com/v2/jobs

Query params:

Key | Value | Description
apikey | {{API_KEY}} | Your API key

Responses:

Status: OK | Code: 200

{
    "total": 5,
    "limit": 1000,
    "offset": 0,
    "returned": 5,
    "result": [
        {
            "job_id": 3689527,
            "account_id": 59703,
            "agent_id": "lmqdjwd972",
            "type": "scraping",
            "status": "completed",
            "priority": 0,
            "pages_total": 1,
            "pages_processed": 1,
            "pages_succeeded": 1,
            "pages_failed": 1,
            "pages_credit": 1,
            "created_at": "2022-01-10T10:16:41Z",
            "started_at": "2022-01-10T10:19:59Z",
            "completed_at": "2022-01-10T10:20:01Z",
            "stopped_at": null,
            "is_scheduled": false,
            "queue_time": "00:03:18",
            "run_duration": "00:00:02",
            "error": null,
            "ping_at": null,
            "assigned_worker_id": null,
            "running_worker_id": null
        },
        {
            "job_id": 3689448,
            "account_id": 59703,
            "agent_id": "w9qmlp5475",
            "type": "scraping",
            "status": "completed",
            "priority": 0,
            "pages_total": 1,
            "pages_processed": 1,
            "pages_succeeded": 1,
            "pages_failed": 0,
            "pages_credit": 1,
            "created_at": "2022-01-10T10:01:58Z",
            "started_at": "2022-01-10T10:11:11Z",
            "completed_at": "2022-01-10T10:11:13Z",
            "stopped_at": null,
            "is_scheduled": false,
            "queue_time": "00:09:13",
            "run_duration": "00:00:02",
            "error": null,
            "ping_at": null,
            "assigned_worker_id": null,
            "running_worker_id": null
        },
        {
            "job_id": 3689064,
            "account_id": 59703,
            "agent_id": "w9qmlp5475",
            "type": "scraping",
            "status": "stopped",
            "priority": 0,
            "pages_total": 0,
            "pages_processed": 0,
            "pages_succeeded": 0,
            "pages_failed": 0,
            "pages_credit": 0,
            "created_at": "2022-01-10T09:59:05Z",
            "started_at": null,
            "completed_at": "2022-01-10T09:59:37Z",
            "stopped_at": "2022-01-10T09:59:37Z",
            "is_scheduled": false,
            "queue_time": null,
            "run_duration": null,
            "error": null,
            "ping_at": null,
            "assigned_worker_id": null,
            "running_worker_id": null
        }
    ]
}
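
Once the listing is decoded, it can be summarized on the client side. The Python sketch below counts jobs per status and totals the page credits used; summarize_jobs is a hypothetical helper, and it relies only on the result, status, and pages_credit fields shown in the example response.

```python
from collections import Counter


def summarize_jobs(listing: dict) -> dict:
    """Count jobs per status and total the page credits in a job listing."""
    jobs = listing.get("result", [])
    return {
        "by_status": Counter(job["status"] for job in jobs),
        "pages_credit": sum(job.get("pages_credit", 0) for job in jobs),
    }
```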

Get jobs by agent id

Get all the historical jobs for the given agent id.

Endpoint:

Method: GET
URL: https://api.agenty.com/v2/jobs

Query params:

Key | Value | Description
agent_id | {{AGENT_ID}} | Your agent id
apikey | {{API_KEY}} | Your API key

Responses:

Status: OK | Code: 200

{
    "total": 2,
    "limit": 1000,
    "offset": 0,
    "returned": 2,
    "result": [
        {
            "job_id": 3689773,
            "account_id": 59703,
            "agent_id": "45l0wewqzk",
            "type": "scraping",
            "status": "completed",
            "priority": 0,
            "pages_total": 1,
            "pages_processed": 1,
            "pages_succeeded": 1,
            "pages_failed": 0,
            "pages_credit": 1,
            "created_at": "2022-01-10T10:55:27Z",
            "started_at": "2022-01-10T10:55:28Z",
            "completed_at": "2022-01-10T10:55:30Z",
            "stopped_at": null,
            "is_scheduled": false,
            "queue_time": "00:00:01",
            "run_duration": "00:00:02",
            "error": null,
            "ping_at": null,
            "assigned_worker_id": null,
            "running_worker_id": null
        },
        {
            "job_id": 3683466,
            "account_id": 59703,
            "agent_id": "45l0wewqzk",
            "type": "scraping",
            "status": "completed",
            "priority": 0,
            "pages_total": 1,
            "pages_processed": 1,
            "pages_succeeded": 1,
            "pages_failed": 0,
            "pages_credit": 1,
            "created_at": "2022-01-10T07:47:49Z",
            "started_at": "2022-01-10T07:48:18Z",
            "completed_at": "2022-01-10T07:48:21Z",
            "stopped_at": null,
            "is_scheduled": false,
            "queue_time": "00:00:29",
            "run_duration": "00:00:03",
            "error": null,
            "ping_at": null,
            "assigned_worker_id": null,
            "running_worker_id": null
        }
    ]
}
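
A small Python sketch for working with an agent's job history. It builds the listing URL with the optional agent_id filter and converts queue_time and run_duration values into timedelta objects. The helper names jobs_url and parse_duration are hypothetical, and the "HH:MM:SS" layout is an assumption based on the example values above; longer durations may be formatted differently.

```python
import urllib.parse
from datetime import timedelta
from typing import Optional


def jobs_url(api_key: str, agent_id: Optional[str] = None) -> str:
    """Build the job listing URL, filtered by agent when agent_id is given."""
    params = {"apikey": api_key}
    if agent_id is not None:
        params["agent_id"] = agent_id
    return "https://api.agenty.com/v2/jobs?" + urllib.parse.urlencode(params)


def parse_duration(value: Optional[str]) -> Optional[timedelta]:
    """Convert an "HH:MM:SS" string to a timedelta; pass null through as None."""
    if value is None:
        return None
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

With these helpers, the total time a job spent between submission and completion is parse_duration(job["queue_time"]) + parse_duration(job["run_duration"]), provided neither value is null.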
