A Free and Powerful Web Scraper
For this project, we will use ParseHub, a free and powerful web scraper that can extract data from any website.
We will extract data from Amazon, using ParseHub to interact with the search bar, perform a search, and scrape the dynamically loaded content on the search results page.
Make sure to download and install ParseHub for free before getting started.
Now, let’s get started with our project.
- Install and open ParseHub. Click on “New Project” and enter the URL you will be scraping data from. In this case, we will scrape data from Amazon.com. The page will then render inside the app.
- Go back to your first selection and rename it “search_bar”.
- Now let’s set ParseHub to click on the search button and load the search results page.
- Click on the PLUS(+) sign next to your “page” selection and choose the “Select” command.
- With the Select command, click on the Search Button to select it. It will be highlighted in green to indicate that it has been selected. Rename your selection to “button”.
- Now click on the PLUS(+) sign next to the “button” selection and choose the “Click” command.
- A pop-up will appear asking you if this is a “next page” button. Click on “No”, rename your template to “results_template” and click on the “Create New Template” button. The search results page will load inside the app.
Want to set up ParseHub to search through a list of keywords? Check out our guide on how to enter a list of keywords into a search box.
Extracting Data from a Search Results Page
Let’s now set up ParseHub to extract data from the Amazon search results page.
- With the Select command created by default, click on the name of the first non-sponsored product on the page. It will be highlighted in green to indicate that it has been selected.
- Now click on the second product name on the page to select them all. They will now all be highlighted in green. Rename your selection to “product”.
- ParseHub is now pulling the name and URL for each listing on the page.
Want to learn how to scrape even more data from Amazon, such as pricing and product details? Check our guide on how to extract product data from Amazon.
To extract the data you have selected, click on the green Get Data button in the left sidebar.
Here you will be able to test, schedule or run your scrape job.
In this case, we will run it right away. Once your scrape is complete, you will be able to download the results as a CSV or JSON file.
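Once you have the file, you may want to tidy it up before using it. Below is a minimal Python sketch of one way to do that. It assumes the JSON export groups results under the selection name “product”, with a “name” and a “url” field per listing; check your own download, since the exact layout may differ. The sample data and the helper names load_products and to_csv are made up for illustration.

```python
import csv
import io
import json

# Hypothetical sample of a JSON export. In practice you would read the
# downloaded file instead. The "product", "name" and "url" keys are
# assumptions based on the selection name used in this project.
SAMPLE_EXPORT = """
{
  "product": [
    {"name": "Example Product A", "url": "https://www.amazon.com/dp/EXAMPLE0001"},
    {"name": "Example Product B", "url": "https://www.amazon.com/dp/EXAMPLE0002"},
    {"name": "Example Product A", "url": "https://www.amazon.com/dp/EXAMPLE0001"}
  ]
}
"""


def load_products(raw_json, selection="product"):
    """Return unique (name, url) pairs from the export, in original order."""
    data = json.loads(raw_json)
    seen = set()
    products = []
    for item in data.get(selection, []):
        pair = (item.get("name", ""), item.get("url", ""))
        if pair not in seen:
            seen.add(pair)
            products.append(pair)
    return products


def to_csv(products):
    """Serialize the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "url"])
    writer.writerows(products)
    return buffer.getvalue()


products = load_products(SAMPLE_EXPORT)
print(to_csv(products))
```

To use this with a real export, replace SAMPLE_EXPORT with the contents of your downloaded JSON file, for example by opening the file and calling its read() method.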
What website will you scrape first?