Selenium can be an effective method of extracting online data. But using Selenium can be intimidating if you don’t know what you’re doing or where to start!
While there is no perfect method for web scraping, since everyone has their own preference, we’ll teach you how to web scrape without Selenium. You’ll need an automated web scraper to do the job.
But first, let’s explain what Selenium is.
What is Selenium?
Selenium is an open-source browser automation tool. It is primarily used in the industry for testing web applications, but it can also be used for web scraping. You can use it to test your web application in many ways, such as:
- Having it click on buttons
- Entering text into forms
- Browsing through your site to check whether everything is "OK", and so on.
While using Selenium is a great skill to learn, we want to offer another skill that we think you’ll enjoy: using an automated web scraping tool.
While there are many web scrapers out there, we think you’ll enjoy ParseHub!
It’s free to download, easy to use, cloud-based, and powerful.
Web scraping without Selenium
We are going to walk through an easy example of what web scraping can do: scraping the 4K TVs sold on Newegg.
- First, make sure to download and install ParseHub. We will use this web scraper for this project.
- Open ParseHub, click on “New Project” and use the URL from Newegg’s result page. The page will now be rendered inside the app.
Scraping Newegg Results Page
- Once the site is rendered, click on the product name of the first result on the page. The name you’ve clicked will become green to indicate that it’s been selected.
- The rest of the product names will be highlighted in yellow. Click on the second one on the list. Now all of the items will be highlighted in green.
- On the left sidebar, rename your selection to product. You will notice that ParseHub is now extracting the product name and URL for each product.
- On the left sidebar, click the PLUS(+) sign next to the product selection and choose the Relative Select command.
- Using the Relative Select command, click on the first product name on the page and then on its listing price. You will see an arrow connect the two selections.
- Expand the new command you’ve created and then delete the URL that is also being extracted by default.
- Repeat the previous three steps to also extract the shipping cost, product image, and brand image. Make sure to rename your new selections accordingly.
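To picture what the selections above add up to, here is a minimal Python sketch of a single scraped row. ParseHub itself requires no code; the `Product` class, the `product_from_row` helper, the field names (which simply mirror how you might rename your selections), and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    """One scraped row. Field names are hypothetical, not fixed by ParseHub."""
    name: str                          # text of the product selection
    url: str                           # link extracted along with the name
    price: Optional[str] = None        # added with a Relative Select
    shipping: Optional[str] = None     # added with a Relative Select
    image: Optional[str] = None        # added with a Relative Select
    brand_image: Optional[str] = None  # added with a Relative Select


def product_from_row(row: dict) -> Product:
    """Build a Product, leaving any field that was not scraped as None."""
    return Product(
        name=row["name"],
        url=row["url"],
        price=row.get("price"),
        shipping=row.get("shipping"),
        image=row.get("image"),
        brand_image=row.get("brand_image"),
    )


# Made-up sample row; real values depend on the page you scrape.
sample_row = {
    "name": "Example 55-inch 4K TV",
    "url": "https://example.com/tv-55",
    "price": "$499.99",
}
product = product_from_row(sample_row)
print(product)
```

Keeping the Relative Select fields optional reflects that a listing may not show every value, for instance when no shipping cost is displayed.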
Running and Exporting your Project
Now that we are done setting up the project, it’s time to run our scrape job.
On the left sidebar, click on the Get Data button and click on the Run button to run your scrape. For longer projects, we recommend doing a Test Run to verify that your data will be formatted correctly.
After the scrape job is completed, you will be able to download all the information you’ve requested as a handy spreadsheet or as a JSON file.
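If you pick the JSON download, a few lines of Python can tidy the results. The sketch below is only an illustration under stated assumptions: it supposes the export has a top-level `product` key (matching the selection name used earlier) holding a list of objects with `name`, `url`, and `price` fields, and that prices arrive as text such as "$1,299.99". The helper names and sample data are hypothetical, so check your own export before relying on this layout.

```python
import csv
import io
import json


def parse_price(text):
    """Turn a price string such as "$1,299.99" into a float."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned)


def rows_from_export(raw_json, key="product"):
    """Read the assumed export layout and return a list of flat row dicts."""
    data = json.loads(raw_json)
    rows = []
    for item in data.get(key, []):
        price_text = item.get("price")
        rows.append({
            "name": item.get("name", ""),
            "url": item.get("url", ""),
            # Keep None when a listing had no price to extract.
            "price": parse_price(price_text) if price_text else None,
        })
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "url", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for a downloaded export file.
sample_export = json.dumps({
    "product": [
        {"name": "Example 55-inch 4K TV", "url": "https://example.com/tv-55",
         "price": "$1,299.99"},
        {"name": "Example 65-inch 4K TV", "url": "https://example.com/tv-65"},
    ]
})
rows = rows_from_export(sample_export)
print(to_csv(rows))
```

To use it on a real download, read the file's contents into a string and pass that to `rows_from_export` in place of `sample_export`.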
Learn more about web scraping
If you want to learn more about web scraping, we offer free online web scraping courses! You’ll be able to learn more about web scraping and get a certificate of completion once you’re done!
Happy scraping!