In this tutorial, you will learn how to scrape large amounts of Flipkart product data with our free web scraper, ParseHub.
Flipkart was founded in 2007 by former Amazon employees and is India’s second-largest e-commerce company, right after Amazon India. In fact, Flipkart accounts for over 40% of India’s e-commerce market and sells a similar volume of electronic goods to Amazon in India. We also have a tutorial on scraping Amazon, which you may find useful as well. As of 2022, Walmart holds a 77% stake in Flipkart, with the platform valued at over 37 billion dollars. After this tutorial, you will be able to scrape all sorts of products from Flipkart, from books to mobile phones.
Let’s begin scraping Flipkart!
Step 1: Scraping Products
- Begin by opening the ParseHub software on your PC, Mac, or Linux system.
- Click the “New Project” button and enter the Flipkart URL you wish to scrape. We will be scraping phone cases with this URL: https://www.flipkart.com/search?q=phone%20case&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off
- Once the page loads, click the first product’s name to extract it. Click the next product’s name as well.
- Scroll to the next row, and click the first product’s name there too, to fully extract all 40 products on the first page. Sometimes you need to click multiple products to train the algorithm.
- Rename this extraction on the left to “product”.
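If you want to scrape a different product category, only the q parameter of the search URL needs to change. The following is a minimal Python sketch, assuming the other query parameters can stay exactly as they appear in the URL above. The function name build_search_url is our own, not part of ParseHub or Flipkart.

```python
from urllib.parse import quote, urlencode

# Base path taken from the tutorial URL.
BASE_URL = "https://www.flipkart.com/search"


def build_search_url(query: str) -> str:
    """Build a Flipkart search URL for the given search term.

    The extra parameters are copied from the tutorial URL; whether
    Flipkart requires all of them is an assumption we have not verified.
    """
    params = {
        "q": query,
        "otracker": "search",
        "otracker1": "search",
        "marketplace": "FLIPKART",
        "as-show": "on",
        "as": "off",
    }
    # quote_via=quote encodes spaces as %20 (the default would use "+").
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


url = build_search_url("phone case")
print(url)
```

Paste the printed URL into the “New Project” box in place of the phone case URL to start from a different search.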
Step 2: Scraping Prices and Ratings
- Begin by clicking the PLUS(+) button next to your “product” selection from before.
- Choose “Relative Select” in the dropdown and click the first product’s name.
- Move the arrow to the respective product’s price, and click the price.
- All 40 prices should now be extracted. Rename this selection to “price” on the left.
- Do another Relative Select on your product selection, and click the product name again.
- Move the arrow to the product’s rating, and click the rating to complete the selection.
- Redo the clicks for the next product to train the algorithm.
- Rename this selection on the left to “rating”.
Note: if the first product does not have a rating, you will have to do the Relative Select on the first product that does!
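Prices and ratings are extracted as text, so you may want to convert them to numbers after the scrape. Here is a minimal Python sketch, assuming prices look like “₹1,299” (whole rupees with comma separators) and ratings look like “4.2”. Both formats are assumptions about the page, and the helper names parse_price and parse_rating are our own.

```python
import re
from typing import Optional


def parse_price(text: Optional[str]) -> Optional[int]:
    """Convert a price string such as '₹1,299' to whole rupees.

    Returns None when no digits are found. Any decimal part is ignored.
    """
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def parse_rating(text: Optional[str]) -> Optional[float]:
    """Convert a rating string such as '4.2' to a float.

    Returns None for products without a rating, which the note above
    says can happen.
    """
    if not text:
        return None
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


print(parse_price("₹1,299"), parse_rating("4.2"), parse_rating(None))
```

Returning None for a missing rating keeps unrated products in your data instead of silently dropping them.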
Step 3: Pagination
To scrape multiple pages or every single page, we need to use ParseHub’s pagination.
- Start by scrolling down the Flipkart page until you see the next page button.
- Click the PLUS(+) button next to your “page” selection, not to be confused with your “product” selection, and choose “Select”.
- Click the next page button to extract it.
- Rename this selection to “pagination” on the left, expand it and delete the two extractions.
- Now click the PLUS(+) button next to your “pagination” selection and choose “Click”.
- This is a next-page button, so choose “Yes” on the popup.
- Finally, choose the number of additional pages you wish to scrape. We chose 2 in our example, to scrape 3 pages in total. Enter 0 if you want to scrape every available page!
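The page count can be confusing because the number you enter excludes the first page. As a quick sanity check, here is a small Python sketch of the arithmetic, assuming 40 products per page as seen on the first page of our example. The function names are our own, and later pages may hold fewer products.

```python
from typing import Optional

# Products per page as observed in this tutorial; treat it as an estimate.
PRODUCTS_PER_PAGE = 40


def total_pages(additional_pages: int) -> Optional[int]:
    """Pages scraped for a given 'additional pages' setting.

    0 means every available page, so the total is unknown (None).
    """
    if additional_pages < 0:
        raise ValueError("additional_pages must be 0 or greater")
    if additional_pages == 0:
        return None
    # The first page is always scraped, plus the additional ones.
    return additional_pages + 1


def expected_products(additional_pages: int) -> Optional[int]:
    """Rough upper bound on the number of products scraped."""
    pages = total_pages(additional_pages)
    return None if pages is None else pages * PRODUCTS_PER_PAGE


print(total_pages(2), expected_products(2))
```

With our setting of 2 additional pages, this gives 3 pages and roughly 120 products.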
Step 4: Bypassing Blocks
Before you begin scraping, you need to enable ParseHub’s IP Rotation.
Note: this is a paid feature of ParseHub. You can view our plans here.
- To enable IP Rotation, click the settings cog on the top left of ParseHub.
- Choose “Settings” to open up the settings menu.
- Tick the “Rotate IP Addresses” checkbox.
- Agree to the IP Rotation popup.
- You’re now ready to scrape without blocks!
Step 5: Starting Your Scrape
To begin scraping on ParseHub’s servers, click the green “Get Data” button on the left pane.
You can Test, Run, or Schedule your scrape. In our case, we chose Run to scrape our 3 pages of products a single time, as specified in our pagination step.
This is what our data export looked like:
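If you download your results as JSON, you can post-process them in code. The following is a minimal Python sketch, assuming the export contains a top-level “product” list whose items carry “name”, “price”, and “rating” keys, matching the selection names used in this tutorial. The exact shape of your export may differ, and the sample records below are placeholders, not real Flipkart data.

```python
import csv
import io
import json
from typing import Dict, List

# Placeholder records for illustration only; not actual scraped data.
SAMPLE_EXPORT = """
{"product": [
  {"name": "Example Case A", "price": "₹299", "rating": "4.2"},
  {"name": "Example Case B", "price": "₹1,299"}
]}
"""

FIELDS = ["name", "price", "rating"]


def export_to_rows(raw_json: str) -> List[Dict[str, str]]:
    """Flatten the assumed export shape into one row per product."""
    data = json.loads(raw_json)
    rows = []
    for item in data.get("product", []):
        # Missing fields (for example, no rating) become empty strings.
        rows.append({field: item.get(field, "") for field in FIELDS})
    return rows


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = export_to_rows(SAMPLE_EXPORT)
print(rows_to_csv(rows))
```

To use it on your own data, read the downloaded file into a string and pass it to export_to_rows in place of SAMPLE_EXPORT.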
Need more help with e-commerce web scraping, or scraping a specific website?
Feel free to contact our live support.
Happy Scraping! 💻