Job Board Web Scraping – How To Scrape Workopolis
In this guide, we will show you how to scrape hiring companies, job listings, locations and salaries from the Workopolis website, all with our free web scraper, ParseHub!
Workopolis is one of Canada’s top job boards, founded in 2000 in Toronto. It generates 25-50 million dollars in revenue and was acquired by Recruit Holdings, which also operates Glassdoor and Quipper. The website is very similar to Indeed, another job board owned by Recruit Holdings. At the end of this blog, we link our other job board scraping tutorials, although the scraping process is very similar for each.
Let's start scraping jobs!
Step 1: Scraping Hiring Companies
- Begin by opening ParseHub on your PC, Mac or Linux system.
- Create a new project with the “New Project” button.
- Enter the Workopolis URL you wish to scrape. Our example URL searches for Data Scientist jobs in Toronto: https://www.workopolis.com/jobsearch/find-jobs?ak=data+scientist&l=Toronto%2C+ON&job=-0SVgzcemakeSpFBTlAMUNvPDo2zTIlxSG6uqHu2CmSmcWm0IgTXw9othw_fLgo9&pc=ABkAAQAZAAAAAAAAAAAAAAAB9h3%2FagEBAQYBQAWwolrQfMKQlGA6nsv9W4Jp8joeTBtY3JvnD1J81a1dLg%3D%3D&pn=2
- When the page fully loads, click the first company to extract it. The rest of the companies should now be yellow.
- Click the next company to train the ParseHub algorithm.
- All 25 hiring companies on the first page should now be extracted. Rename this selection to “company” on the left.
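The search URL from this step is an ordinary query string: in our example, `ak` carries the keywords and `l` carries the location. If you want to prepare URLs for other searches, a minimal Python sketch could look like the one below. It assumes those two parameters are enough for a basic search (the long `job` and `pc` values from our URL are left out), and the helper name `build_search_url` is our own, not part of ParseHub or Workopolis.

```python
from urllib.parse import urlencode

# Path taken from the example URL in this tutorial.
BASE_URL = "https://www.workopolis.com/jobsearch/find-jobs"


def build_search_url(keywords: str, location: str) -> str:
    """Build a search URL from keywords and a location.

    Assumption: the `ak` (keywords) and `l` (location) parameters seen in
    the example URL are sufficient on their own to load a results page.
    """
    query = urlencode({"ak": keywords, "l": location})
    return f"{BASE_URL}?{query}"


print(build_search_url("data scientist", "Toronto, ON"))
```

You can paste the resulting URL into the “New Project” dialog in the same way as the example URL.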
Step 2: Scraping Job Details
- Begin by clicking on the PLUS(+) button next to your “company” extraction.
- Click “Relative Select” and click the first company’s name.
- Now click that listing’s job title to close the arrow.
- Redo these two clicks for the next job in the list, to train the algorithm.
- Rename this extraction on the left to “role”.
- Now you can do the same steps for the job summary, redoing the Relative Select but this time closing the arrow on the info.
- Remember to do it on the second listing as well to train the algorithm. Rename this selection to “info”.
- Finally, you can do a third Relative Select, this time on the salary.
- Rename the selection to “salary” on the left.
- You can repeat these steps for other relative details you wish to scrape.
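Relative Select ties each role, summary and salary to the company it belongs to, so the results come out as one record per listing rather than as separate, unrelated lists. The Python sketch below illustrates that shape and flattens it into CSV text. The values are placeholders rather than real scraped data, and the layout (a “company” list whose entries hold the company text under `name`) is an assumption about the export, so check it against your own results.

```python
import csv
import io

# Placeholder records, NOT real scraped data. The keys mirror the selection
# names used in this tutorial; "name" for the company text is an assumption.
sample_export = {
    "company": [
        {"name": "Example Corp", "role": "Data Scientist",
         "info": "Placeholder summary", "salary": "Placeholder salary"},
        # A listing might not show a salary, so that key may be absent.
        {"name": "Sample Inc", "role": "Data Analyst",
         "info": "Placeholder summary"},
    ]
}

FIELDS = ["name", "role", "info", "salary"]


def to_rows(export: dict) -> list:
    """Return one dict per listing, using '' for any missing field."""
    return [
        {field: listing.get(field, "") for field in FIELDS}
        for listing in export.get("company", [])
    ]


def to_csv(rows: list) -> str:
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_rows(sample_export)
print(to_csv(rows))
```

Any extra Relative Select you add would simply become another key on each listing and another entry in `FIELDS`.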
Step 3: Pagination
If you would like to scrape beyond the first page, you will need to use ParseHub’s pagination.
- First, scroll to the bottom of the page until you see the next-page button.
- Click the PLUS(+) icon next to the “page” selection, which is the first on the top left.
- Choose “Select” and click the next-page chevron; it should be an A tag.
- Rename this selection on the left to “pagination”.
- Expand the “pagination” data and remove the two unnecessary extractions.
- Click the PLUS(+) button next to your “pagination” selection and choose “Click”.
- This is a next-page button, so choose “Yes” on the popup.
- Finally, choose the number of additional pages you wish to scrape. We chose 2, which means 3 pages of scraped data in total.
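The page count in that last step is simple arithmetic: the number you enter is added to the first page. Here is a small Python sketch of it, assuming every results page holds 25 listings like the first page did in Step 1, which makes the listing count an upper bound rather than a guarantee. The function names are our own and not part of ParseHub.

```python
from typing import Optional

# Assumption: each page holds the same 25 listings seen on the first page.
LISTINGS_PER_PAGE = 25


def total_pages(additional_pages: int) -> Optional[int]:
    """Return the number of pages scraped, or None when there is no limit.

    Entering 0 in the popup means "keep going through every page",
    as described for the run in Step 4 of this tutorial.
    """
    if additional_pages < 0:
        raise ValueError("additional_pages must be 0 or greater")
    if additional_pages == 0:
        return None
    return additional_pages + 1  # the first page plus the extra ones


def max_listings(additional_pages: int) -> Optional[int]:
    """Return the upper bound on listings scraped, or None when unlimited."""
    pages = total_pages(additional_pages)
    return None if pages is None else pages * LISTINGS_PER_PAGE


print(total_pages(2))   # 3
print(max_listings(2))  # 75
```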
Step 4: Start Job Board Scraping
Congratulations! You have made your initial company selections, extracted the related details from each company listing, and set up pagination to scrape as many pages of job data as you need. You are now ready to begin scraping on ParseHub’s servers.
Begin by clicking on the green “Get Data” button. You may choose to Test, Run or Schedule your scrape. We chose Run to get a single pass through the data, which is 3 pages’ worth, as specified in our pagination step. If you entered zero on that popup, you will get every single job listing available!
Here is what our data export looked like:
Need help web scraping job-related data or with scraping any other website? You can contact our live support.
You can also check out our tutorial on web scraping Indeed.
Happy Scraping! 💻