Scrape University Course Information into JSON

This automation scrapes detailed information about university courses from their official web pages, including international student entry requirements, and structures the data into a clean, easy-to-use JSON format.
Because each course is captured as one structured record, the output is well suited to analysis and comparison. Here is a step-by-step overview of its behavior:
  • Receives Input: The process begins with a user-provided list of university course URLs.
  • Navigates and Prepares: For each URL, the automation opens the page and automatically clicks on the "International" students tab to ensure the correct information is displayed.
  • Extracts Core Data: It intelligently scans the page to extract key details such as the course name, duration, campus locations, fees, and a general description.
  • Finds Entry Requirements: The automation locates and expands the "Entry Requirements" section to capture English proficiency scores (IELTS, PTE) and country-specific academic requirements.
  • Performs Deep Searches: If crucial information, like TOEFL scores or specific country requirements, is not found on the main page, the automation navigates to dedicated pages on the university's website (e.g., "English Language Requirements," "Country-Specific Requirements") to find it.
  • Processes PDFs: It identifies and downloads any linked PDF brochures or course guides, extracting supplementary data from them to fill in any gaps.
  • Handles Country-Specific Logic: The automation is configured to parse entry requirements for a specific country (by default, India) and intelligently categorize them by educational board (e.g., CBSE, CISCE).
  • Compiles and Outputs: Finally, it aggregates all the information gathered from the webpage, supplementary pages, and PDFs into a single, well-structured JSON object for each course. The end result is a JSON file containing an array of all the processed courses, ready for use in other applications, databases, or analysis tools.
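The final compile step can be pictured as a gap-filling merge. The sketch below is a minimal illustration, not the template's actual implementation: the field names, the `merge_sources` helper, and the sample values are all assumptions made for the example. It takes partial records from the course page, the supplementary pages, and the PDFs in priority order, and lets each later source fill only the fields that earlier sources left empty.

```python
import json

# Hypothetical field names; the template's real output schema may differ.
FIELDS = ["courseName", "duration", "campusLocations", "fees",
          "description", "ielts", "pte", "toefl", "countryRequirements"]


def is_missing(value):
    """Treat None, empty strings, and empty containers as gaps."""
    return value is None or value == "" or value == [] or value == {}


def merge_sources(url, *sources):
    """Merge partial records in priority order.

    The first source that has a non-empty value for a field wins;
    later sources only fill gaps. Fields found nowhere stay None.
    """
    course = {"url": url}
    for field in FIELDS:
        course[field] = None
        for source in sources:
            value = source.get(field)
            if not is_missing(value):
                course[field] = value
                break
    return course


# Illustrative values only, not real course data.
main_page = {"courseName": "Example Bachelor Course", "ielts": "6.0", "toefl": None}
english_page = {"toefl": "72"}
pdf_brochure = {"duration": "3 years", "toefl": "80"}

courses = [merge_sources("https://example.edu/courses/example",
                         main_page, english_page, pdf_brochure)]
print(json.dumps(courses, indent=2))
```

In this sketch the course page has the highest priority, so a value found in a PDF never overwrites one found on the page itself. Whether the template uses the same precedence is not stated above.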
Usage Ideas
  • Build a Course Comparison Tool: Use the structured JSON output to populate a database for a website or app that allows students to compare courses across different criteria.
  • Power Student Advisory Services: Education agents and counselors can use this to quickly gather up-to-date information for advising international students.
  • Conduct Competitive Analysis: Universities can automate the collection of data on competitor institutions' course offerings, fees, and entry requirements.
  • Perform Market Research: Analyze trends in higher education, such as the most common course durations, the rise and fall of certain subjects, and fee structures.
Customization Ideas
This template is a powerful starting point that you can easily tailor to your specific needs. You have the flexibility to:
  • Target Any University: The template is designed for UTAS (utas.edu.au) course pages, but you can adapt it to scrape course information from other university websites.
  • Specify Any Course List: You can provide your own list of course URLs to gather data on the programs that matter most to you.
  • Change the Target Country: The template is set up to find entry requirements for students from India. You can easily change this to any other country, such as China, Nigeria, or Brazil.
  • Customize Data Extraction: You can modify the automation to look for different pieces of information. For example, you could add fields for "scholarship opportunities," "career outcomes," or "faculty members."
  • Adjust the Output Format: You can change the names and structure of the fields in the final JSON output to match the requirements of your own systems or databases.
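If you prefer to leave the automation untouched and adjust the output format afterwards, a small post-processing step is one option. The sketch below assumes the output is a list of flat JSON objects. The `rename_fields` helper and every field name in it are hypothetical, chosen only to illustrate the idea.

```python
def rename_fields(course, field_map):
    """Return a copy of `course` with keys renamed per `field_map`.

    Keys that do not appear in `field_map` are kept unchanged.
    """
    return {field_map.get(key, key): value for key, value in course.items()}


# Hypothetical mapping from the scraper's field names to your own schema.
FIELD_MAP = {"courseName": "title", "fees": "tuition_fees"}

# Illustrative record, not real course data.
course = {"courseName": "Example Bachelor Course", "fees": "1000", "duration": "3 years"}
print(rename_fields(course, FIELD_MAP))
```

Renaming after the fact keeps the scraping logic and your downstream schema independent, at the cost of one extra processing step.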
Agent Inputs
Required Parameters
  • courseUrls (type: array<string>, default: []): List of UTAS course URLs to extract information from (e.g. https://www.utas.edu.au/courses/sci-eng/courses/p3t-bachelor-of-information-and-communication-technology)
Optional Parameters
  • maxCoursesToProcess (type: number, default: -1): Maximum number of courses to process from the list. Use -1 to process all courses.
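How the two inputs interact can be sketched in a few lines. This is an assumed reading of the parameters, not the template's own code: `select_courses` is a hypothetical helper, and treating every negative value (not only -1) as "process all" is an assumption made here for simplicity.

```python
def select_courses(course_urls, max_courses_to_process=-1):
    """Return the URLs that would be processed.

    -1 means "process all", as described above. Treating any other
    negative value the same way is an assumption of this sketch.
    """
    if max_courses_to_process < 0:
        return list(course_urls)
    return list(course_urls[:max_courses_to_process])


# Placeholder URLs for illustration only.
urls = [
    "https://example.edu/courses/a",
    "https://example.edu/courses/b",
    "https://example.edu/courses/c",
]
print(select_courses(urls))      # all three URLs
print(select_courses(urls, 2))   # only the first two
```

A small limit such as 2 is a convenient way to test a customized template on a few courses before running the full list.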