Login and Scrape Data from a Web Portal
This automation logs into any website that requires a username and password, handles two-factor authentication (OTP), and systematically scrapes structured data, such as records from a table, across multiple pages. It's a robust starting point for extracting information from gated web portals, CRMs, or supplier dashboards.
This automation is designed to reliably extract data from behind a login screen. Its behavior follows these steps:
- Navigates to the Site: It opens the specific website URL.
- Checks Login Status: It intelligently determines if it's already logged in or needs to authenticate.
- Logs In: If needed, it enters an email and password from a secure vault.
- Handles Two-Factor Authentication: It can enter a one-time password (OTP) from a vault to complete the login.
- Verifies Success: It confirms that the login was successful by checking for expected content on the dashboard page.
- Scrapes Data: It locates a table of data and extracts information from each row, field by field.
- Handles Pagination: It automatically clicks the "Next" button to navigate through all available pages, scraping data from each one.
- Returns Data: When finished, it returns all the scraped data in a clean, structured JSON format.
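The scraping, pagination, and JSON-return steps above can be sketched in plain Python. This is a minimal, self-contained illustration, not the automation's actual implementation: the PAGES dict, the two-column table layout, and the "next" link class are all made-up stand-ins for pages a real run would fetch after logging in.

```python
import json
from html.parser import HTMLParser

# Hypothetical in-memory "portal": two pages of an HTML table plus a
# Next link, standing in for pages fetched after authentication.
PAGES = {
    "/records?page=1": """
        <table>
          <tr><td>Acme Corp</td><td>Active</td></tr>
          <tr><td>Globex</td><td>Pending</td></tr>
        </table>
        <a class="next" href="/records?page=2">Next</a>
    """,
    "/records?page=2": """
        <table>
          <tr><td>Initech</td><td>Active</td></tr>
        </table>
    """,
}

class TableParser(HTMLParser):
    """Collects each <tr> as a list of cell strings, plus the Next link."""
    def __init__(self):
        super().__init__()
        self.rows, self.next_url = [], None
        self._row, self._in_cell = None, False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True
        elif tag == "a" and attrs.get("class") == "next":
            self.next_url = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = None
        elif tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

def scrape(start_url):
    """Walk pages via the Next link, turning each table row into a record."""
    records, url = [], start_url
    while url:
        parser = TableParser()
        parser.feed(PAGES[url])   # a real scraper would fetch the page here
        for name, status in parser.rows:
            records.append({"name": name, "status": status})
        url = parser.next_url     # None on the last page ends the loop
    return records

print(json.dumps(scrape("/records?page=1"), indent=2))
```

The loop ends naturally when a page has no Next link, which is how the pagination step knows it has reached the last page.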
Usage Ideas
- Scrape sales leads from a private business directory or a members-only industry group.
- Monitor product information, inventory levels, or pricing from a supplier's web portal.
- Extract order history or shipping statuses from an e-commerce or fulfillment platform for internal analysis.
- Aggregate financial transaction data from multiple online banking or investment portals into a single source.
- Gather contact information from an internal employee directory or a customer CRM.
Customization Ideas
This template is a powerful accelerator for creating your own web scraping automations. You can work with an assistant to customize it for your specific needs. You have the flexibility to:
- Target any website you need to get data from by simply changing the URL.
- Use your own login credentials by pointing the automation to your secure vault secrets.
- Define the exact data you want to collect. You can specify which fields to extract from a page, such as names, prices, dates, statuses, or contact details.
- Adapt to different login forms. The automation can be easily told what the "email" and "password" fields and the "submit" button are called on your target site.
- Control the scope of the scrape. You can set a limit to scrape only a certain number of items or configure it to get everything available.
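Defining which fields to collect can be pictured as a small field-spec mapping. Everything here is hypothetical: FIELD_SPEC maps your chosen output names to the column position holding each value (a real configuration might use CSS selectors instead of indexes).

```python
# Hypothetical field spec: output field name -> column index in a raw row.
FIELD_SPEC = {"name": 0, "price": 2, "status": 3}

def extract_fields(row_cells, spec=FIELD_SPEC):
    """Keep only the configured fields from a raw table row."""
    return {field: row_cells[i] for field, i in spec.items()}

row = ["Widget A", "SKU-100", "$9.99", "In stock"]
print(extract_fields(row))
# → {'name': 'Widget A', 'price': '$9.99', 'status': 'In stock'}
```

Swapping in a different spec (say, names and dates instead of prices) changes what gets collected without touching the scraping loop itself.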
Agent Inputs
Optional Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| maxJobsToProcess | number | -1 | Maximum number of jobs to return. Set to -1 to return all available jobs. |
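The -1 convention for maxJobsToProcess amounts to a simple truncation rule, sketched below with a hypothetical apply_limit helper (not part of the automation itself).

```python
def apply_limit(items, max_jobs=-1):
    """Return all items when max_jobs is -1, else at most max_jobs of them."""
    return list(items) if max_jobs < 0 else list(items)[:max_jobs]

jobs = ["job-1", "job-2", "job-3"]
print(apply_limit(jobs))      # default -1 → all three jobs
print(apply_limit(jobs, 2))   # → ['job-1', 'job-2']
```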