Do you want to sell iPhones on Amazon?

How about some new funky shoes that you don't think are available on the market yet?

You can use web scraping to collect product information into an Excel file, for pricing analysis.

With the free web scraper ParseHub and the tips in this article, you don't have to spend any more time copying and pasting pricing data from the web.

In this tutorial you will learn how to:

  • Tell ParseHub to search for products in three different toy categories.
  • Get the price, description, and reviews for each product you searched for on Amazon.
  • BONUS: Tell ParseHub to search for 40 toy brands on Amazon automatically. Use Mr. Data Converter to convert the list of brands from Excel to JSON, then add the list to ParseHub for automatic searching.

Search for products in 3 categories on Amazon

Open ParseHub and Amazon

  1. Open the ParseHub desktop app. You can download it for free here.
  2. Open http://www.amazon.com/ in the browser. If you are outside of the United States, you may be prompted to go to a different home page, like amazon.ca in Canada. If that is the case, just click on the yellow button to go to your country’s home page.
  3. Click ParseHub’s “New Project” button, then the “Start project on this URL” button, to create a new project from this page.

Enter the three product categories into ParseHub

Let's search for three different doll brands on Amazon.

  1. Open the "Settings" tab of the ParseHub project.
  2. In the "Starting Value:" text box enter your keywords in JSON format, as seen in the line of code below. The keywords can be whatever you would like to search for. Even the list name "keywords" can be renamed to whatever you want, such as "terms" or "brands".

{"keywords":["Barbie Dolls","Bratz Dolls","Disney Dolls"]}
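If your keywords already live in a script or a text file, you may prefer to generate this Starting Value instead of typing it by hand. The snippet below is a minimal sketch using Python's standard `json` module; the helper name `build_starting_value` is made up for this example and is not part of ParseHub.

```python
import json

def build_starting_value(keywords, list_name="keywords"):
    """Return a JSON string that can be pasted into the "Starting Value" box.

    build_starting_value is a hypothetical helper for this tutorial.
    Blank entries are dropped and surrounding whitespace is trimmed.
    """
    cleaned = [k.strip() for k in keywords if k.strip()]
    return json.dumps({list_name: cleaned})

starting_value = build_starting_value(["Barbie Dolls", "Bratz Dolls", "Disney Dolls"])
print(starting_value)
```

Using `json.dumps` rather than building the string yourself means quotation marks and special characters inside a keyword are escaped correctly.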

Create a loop to search through all of the categories

  1. Click the "plus" button next to the command "Select page(1)" to open the command menu.
  2. Open the "Advanced" menu and add the Loop tool.
  3. In the "for each" text box leave the text as item. You can change this text to anything you want, such as "brand" or "term".
  4. In the "In" text box enter the name of the list. In this case the name is keywords, so type that in without quotation marks.
  5. Click on the "x" button next to the command "Empty selection1 (0)" to delete it.

Select the correct search form

  1. Click on the "plus" button next to the Loop that you just added, and open up the "Advanced" menu to choose the command "Begin New Entry". This separates the data by each brand in the JSON and CSV files that ParseHub will give you.
  2. Name the list of new entries whatever you want; I called it brands.
  3. Click on the "plus" button next to the Begin New Entry command, and add the Select command.
  4. Click on the Amazon search bar to select it. This automatically adds an Input command.
  5. Open the Input type drop-down menu and change it from "text" to "expression".
  6. In the input text box, write item with no quotation marks.
  7. Lastly, go back to the "Select selection1" command and rename it "search_bar".

You have just told ParseHub to select the search box and enter each keyword, such as "Barbie Dolls", into the search bar one by one.

  1. Hold down shift and click on the "plus" button that appears next to the "Input item" command.
  2. Add another Select command, and click on the magnifying glass search button next to the search bar to select it.
  3. Rename selection1 as search_button.
  4. Add a Click command by clicking on the "plus" button next to the Select command.
  5. Choose "Create New Template" and enter the name of the new template that you would like to create, like results.
  6. Click on the "Create New Template" button.

This tells ParseHub to click on the button and navigate to the list of products for each different search.
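If you think in code, the Loop, Input and Click commands you just set up behave roughly like the loop below. This is only an analogy written in Python, not how ParseHub is implemented; the function `plan_searches` and the tuple layout are invented for illustration, while `search_bar`, `search_button` and `results` are the names used in the steps above.

```python
import json

# The same Starting Value that was entered in the "Settings" tab.
start = json.loads('{"keywords":["Barbie Dolls","Bratz Dolls","Disney Dolls"]}')

def plan_searches(start_value, list_name="keywords"):
    """Describe, for each keyword, the actions the project performs.

    plan_searches is a hypothetical helper: it only builds a description
    of the steps and does not open a browser or contact Amazon.
    """
    plan = []
    for item in start_value[list_name]:  # Loop: for each item in keywords
        plan.append({
            "brand": item,  # Begin New Entry in the "brands" list
            "steps": [
                ("input", "search_bar", item),           # type the keyword
                ("click", "search_button", "results"),   # open the results template
            ],
        })
    return plan

for entry in plan_searches(start):
    print(entry["brand"], "->", entry["steps"])
```

Each pass through the loop corresponds to one search on Amazon, which is why the results end up grouped by brand.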

Scrape all of the products for each brand

  1. To open the right search results, click on the slider in the top right of the ParseHub toolbar to switch it from "Select" mode to "Browse" mode. Now you can navigate Amazon as if you were using a regular browser. Search for "Barbie Dolls". Now you are on the type of page you would like to scrape.
  2. Now let's select a few products on the page. First, click on the slider again so that you switch from "Browse" mode to "Select" mode.
  3. A Select command will be added automatically. Select the name of one product by clicking on it.
  4. The rest of the products on the page will be highlighted in yellow. Click on a second one to select them all. This will automatically add three new commands: "Begin New Entry in selection 1", "Extract name" and "Extract url". Rename these however you would like. Then scroll through the rest of the page to make sure ParseHub has selected all of the products. If not, keep clicking on the ones that have not been selected.

Scrape the price, reviews and description of all the products

  1. Click on the "plus" button next to the Begin New Entry command and choose the Click command.
  2. In the text box write details and click "Create New Template".

This tells ParseHub to click on each product and go to the corresponding details page for each product.

Scrape the prices and customer reviews for each product

  1. Click on the "Create New Template" button. This will automatically take you to the first product page.
  2. Add a Select command and click on the price of the product.
  3. Scroll down until you see the "Customer Reviews" section.
  4. Click on the "plus" button next to "Select page".
  5. Choose the Select command and click on the percentage beside the 5-star column. Name it "star5", since a name can't start with a number.
  6. Click on the "plus" button beside the "Select page" command again, and click on the 4-star percentage, naming it star4.
  7. Do this for the 3, 2 and 1-star percentages as well, remembering to click the "plus" button on the "Select page" command and not a different command.
  8. Add one more Select command and click on the product Description to extract it as well.

You have now told ParseHub to extract the price, description, and ratings of this doll. ParseHub will do the same for every doll on the first page of results for each of your search terms. You will also have the URLs to the 5, 4, 3, 2, and 1-star reviews, if you would like to visit them.
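Once you have the five percentages, you can combine them into a single average rating per product. The sketch below assumes each of `star5` through `star1` was extracted as text such as "70%"; that format is an assumption about what Amazon displays, and `average_rating` is a hypothetical helper, so adjust the parsing to match your own output.

```python
def average_rating(product):
    """Weighted average star rating from the star1..star5 percentages.

    Assumes values look like "70%" (an assumption, check your own data).
    Returns None when every percentage is zero.
    """
    weighted_sum = 0.0
    total_pct = 0.0
    for stars in range(1, 6):
        pct = float(product[f"star{stars}"].strip().rstrip("%"))
        weighted_sum += stars * pct
        total_pct += pct
    return weighted_sum / total_pct if total_pct else None

# Made-up example values, not real Amazon data.
doll = {"star5": "70%", "star4": "20%", "star3": "10%", "star2": "0%", "star1": "0%"}
print(average_rating(doll))
```

Dividing by the sum of the percentages, rather than by 100, keeps the result sensible when the displayed percentages are rounded and do not add up to exactly 100.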

Run the project and download your results

  1. Click on the "Get Data" button.
  2. Click on "Run" and "Run and Save".
  3. Wait a few minutes for ParseHub to collect the data. When you see the CSV and JSON buttons appear, click on one of them to download your data in CSV format (which opens in Excel) or JSON format.
  4. You can also connect to our API by checking out this API reference.
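If you download the JSON file, a few lines of Python can flatten it into rows for pricing analysis. The sketch below is written under assumptions: the key names `brands`, `selection1`, `name`, `url` and `price` depend entirely on how you named the commands in your own project, the sample data is invented, and `parse_price` and `flatten` are hypothetical helpers.

```python
import json

# Hypothetical sample shaped like a JSON export; your key names depend on
# how you named each command in your own project.
sample_export = json.loads("""
{"brands": [
  {"selection1": [
    {"name": "Example Doll A", "url": "https://example.com/a", "price": "$19.99"},
    {"name": "Example Doll B", "url": "https://example.com/b", "price": "$1,024.50"}
  ]}
]}
""")

def parse_price(text):
    """Turn a price string such as "$1,024.50" into a float, or None if unparseable."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None

def flatten(export, brand_list="brands", product_list="selection1"):
    """Return one flat dict per product across every brand entry."""
    rows = []
    for entry in export.get(brand_list, []):
        for product in entry.get(product_list, []):
            rows.append({
                "name": product.get("name"),
                "url": product.get("url"),
                "price": parse_price(product.get("price", "")),
            })
    return rows

rows = flatten(sample_export)
cheapest = min((r for r in rows if r["price"] is not None), key=lambda r: r["price"])
print(cheapest["name"], cheapest["price"])
```

To use it on a real download, replace `sample_export` with `json.load(open("your_file.json"))` and pass your own list names to `flatten`.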

BONUS: Convert categories from Excel into JSON data for easy input

You may want to search through more than just 3 categories on Amazon. What if you had a thousand or even ten thousand keywords that you would like to search for, and you happened to have them in an Excel spreadsheet?

You can easily convert any data in Excel into JSON using Mr. Data Converter.

Let's use Mr. Data Converter to convert your hypothetical list of categories into JSON.

  1. Open Mr. Data Converter.
  2. Open your Excel file that has all of the keywords and copy them.
  3. Paste the copied keywords into Mr. Data Converter.
  4. Select "JSON - Column Arrays" from the dropdown.

Now you are ready to take this information and paste it into ParseHub. ParseHub will search for all of the 40 keywords that you are about to enter.
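If you would rather do the conversion in a script, the same result can be approximated with Python's standard library. This sketch assumes you saved the spreadsheet as CSV with a header row named `keywords`; the function name `column_to_starting_value` and the sample text are made up for this example, and the output follows the shape of the Starting Value used earlier in this tutorial.

```python
import csv
import io
import json

def column_to_starting_value(csv_text, column="keywords"):
    """Read one column from CSV text and return it as a Starting Value JSON string.

    column_to_starting_value is a hypothetical helper. Empty cells are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    values = []
    for row in reader:
        cell = (row.get(column) or "").strip()
        if cell:
            values.append(cell)
    return json.dumps({column: values})

# Made-up sample standing in for a spreadsheet saved as CSV.
sample_csv = "keywords\nBarbie Dolls\nBratz Dolls\n\nDisney Dolls\n"
print(column_to_starting_value(sample_csv))
```

For a file on disk, read it with `open("keywords.csv", newline="").read()` and pass the text to the function.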

Search for 40 categories on Amazon

  1. Copy the JSON output from Mr. Data Converter.
  2. Open the "Settings" tab of your ParseHub project.
  3. In the Starting Value text box paste the keywords.
  4. That's it! Your project will run exactly the same way as it ran at the beginning of this tutorial. The only difference is the number of categories ParseHub will search through.

Entering thousands of search terms into a web scraping tool has never been easier. You can do the same with URLs and multiple search values if you have more than one search box in the form.

If you need any help setting up a similar type of project just reach out to us at support[at]parsehub[dot]com.

[This post was originally written on December 11, 2015 and updated on August 9, 2019]