Investors look at many factors when deciding whether to buy or sell a stock. One of them is the business's financial statements. These financial statements include:
- Income statement
- Balance Sheet
- Cash Flow
Investors can use the numbers from the financial statements to make their own investment decisions.
We are ParseHub, and today we are going to show you how to scrape financial statements from Yahoo Finance and export them into a spreadsheet in just a few minutes!
To get started, you will need a powerful web scraping tool. We think you’ll enjoy ParseHub! Not only is it free to use, but it also comes with a full suite of features.
You can download ParseHub for free here
Scraping financial statements on Yahoo Finance
Yahoo Finance has the financial statements for businesses that are listed on the stock market. You can simply search for a stock and select which financial statement you want to view.
Since the process for scraping each financial statement is the same, we will show you how to scrape each statement on its own and how to scrape all of them in one project.
For this example, we are going to scrape Apple's financial statements. If you want to follow along, you can click on this link.
So let's get started.
Scrape Income Statement, Balance sheet, and Cash Flow from Yahoo Finance.
First, we will show you how you can scrape the financial statements separately. Since each financial statement has a different URL, you will need to have 3 separate scraping projects.
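If you want to keep the three project URLs organized, here is a minimal Python sketch that builds them for a ticker. The URL pattern and the path segments (`financials`, `balance-sheet`, `cash-flow`) are assumptions based on how Yahoo Finance pages are commonly addressed, and `statement_urls` is a hypothetical helper name, so check the results against the pages you open in your browser.

```python
# Assumed URL pattern; verify it in your browser before using it in a project.
BASE_URL = "https://finance.yahoo.com/quote"

# Hypothetical mapping from statement name to the last URL segment.
STATEMENT_PATHS = {
    "income_statement": "financials",
    "balance_sheet": "balance-sheet",
    "cash_flow": "cash-flow",
}


def statement_urls(ticker: str) -> dict[str, str]:
    """Return one starting URL per financial statement for the given ticker."""
    symbol = ticker.strip().upper()
    return {
        name: f"{BASE_URL}/{symbol}/{path}"
        for name, path in STATEMENT_PATHS.items()
    }


# Print the three URLs you would submit as three separate projects.
for name, url in statement_urls("aapl").items():
    print(f"{name}: {url}")
```

Each printed URL would be the starting URL of one scraping project.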
Scraping Income Statement
1. Once you submit the URL for your project, ParseHub will render the webpage. You will now be able to select the first element you’d like to extract.
2. Select the first row under the breakdown column which is “Total Revenue”. It will be highlighted in green to indicate that it’s being extracted.
3. On the left sidebar, rename the selection to Breakdown. ParseHub is now pulling the text.
4. Now, we will select the rest of the breakdown labels in the list which are highlighted in yellow. Click on the second label “Cost of Revenue” on the list to select them all. They will all now be highlighted in green.
5. We will now ask ParseHub to also pull the numbers that are related to each breakdown. To do this, click on the PLUS(+) sign next to your Breakdown selection and choose the Relative Select command.
6. Using the Relative Select command, click on the first breakdown label that is highlighted in orange and then on the dollar amount in the column beside it. An arrow will appear to show the association you’re creating.
7. You might have to repeat this process for another breakdown label to fully train the scraper. On the left sidebar, rename your selection to match the column header, in this case TTM.
8. Repeat steps 5-7 to pull more data from past years on the financial statements. Be sure to rename them accordingly. Note that selection names can't start with a number, so use a name such as Sept_30_2020.
9. Once you have everything you want to extract, click on the green "Get Data" button to begin your project.
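The naming rule in step 8 can be sketched in Python if you have many columns to rename. This is only an illustration: `selection_name` is a hypothetical helper, and it assumes the column headers look like `9/30/2020` or `TTM`. Adjust the date format if your page shows something different.

```python
import re
from datetime import datetime

# Month abbreviations chosen to match the "Sept_30_2020" style used in this guide.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "June",
          "July", "Aug", "Sept", "Oct", "Nov", "Dec"]


def selection_name(header: str) -> str:
    """Turn a column header into a name that does not start with a number."""
    text = header.strip()
    try:
        # Assumed header format: month/day/year, for example "9/30/2020".
        day = datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        # Not a date (for example "TTM"): keep letters, digits and underscores only.
        name = re.sub(r"\W+", "_", text).strip("_")
    else:
        name = f"{MONTHS[day.month - 1]}_{day.day}_{day.year}"
    # Selection names can't start with a number, so add a prefix if needed.
    if name[:1].isdigit():
        name = f"col_{name}"
    return name


for header in ["TTM", "9/30/2020", "2020 annual"]:
    print(header, "->", selection_name(header))
```

The `col_` prefix is an arbitrary choice for headers that are not dates; any prefix that starts with a letter would do.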
Scraping Cash Flows
1. Once you submit the URL for your project, ParseHub will render the webpage. You will now be able to select the first element you’d like to extract.
2. Select the first row under the breakdown column which is “Operating Cash Flow”. It will be highlighted in green to indicate that it’s being extracted.
3. On the left sidebar, rename the selection to Breakdown. ParseHub is now pulling the text.
4. Now, we will select the rest of the breakdown labels on the list which are highlighted in yellow. Click on the second label “Investing Cash Flow” on the list to select them all. They will all now be highlighted in green.
5. We will now ask ParseHub to also pull the numbers that are related to each breakdown. To do this, click on the PLUS(+) sign next to your Breakdown selection and choose the Relative Select command.
6. Using the Relative Select command, click on the first breakdown label that is highlighted in orange and then on the dollar amount in the column beside it. An arrow will appear to show the association you’re creating.
7. You might have to repeat this process for another breakdown label to fully train the scraper. On the left sidebar, rename your selection to match the column header, in this case TTM.
8. Repeat steps 5-7 to pull more data from past years on the financial statements. Be sure to rename them accordingly. Note that selection names can't start with a number, so use a name such as Sept_30_2020.
9. Once you have everything you want to extract, click on the green "Get Data" button to begin your project.
Scraping Balance Sheet
1. Once you submit the URL for your project, ParseHub will render the webpage. You will now be able to select the first element you’d like to extract.
2. Select the first row under the breakdown column which is “Total Assets”. It will be highlighted in green to indicate that it’s being extracted.
3. On the left sidebar, rename the selection to Breakdown. ParseHub is now pulling the text.
4. Now, we will select the rest of the breakdown labels on the list which are highlighted in yellow. Click on the second label “Total Liabilities Net Minority” on the list to select them all. They will all now be highlighted in green.
5. We will now ask ParseHub to also pull the numbers that are related to each breakdown. To do this, click on the PLUS(+) sign next to your Breakdown selection and choose the Relative Select command.
6. Using the Relative Select command, click on the first breakdown label that is highlighted in orange and then on the dollar amount in the column beside it. An arrow will appear to show the association you’re creating.
7. You might have to repeat this process for another breakdown label to fully train the scraper. On the left sidebar, rename your selection to match the column's date, in this case Sept_30_2020.
8. Repeat steps 5-7 to pull more data from past years on the financial statements. Be sure to rename them accordingly.
9. Once you have everything you want to extract, click on the green "Get Data" button to begin your project.
Now that you know how to scrape each financial statement separately, let’s show you how you can scrape all 3 statements in one project!
Scraping balance sheets, cash flows, and income statements in one project.
To scrape all 3 financial statements, you will need to tell ParseHub to click on each financial statement and then scrape the necessary information. For this example, we are going to start with the balance sheet URL, then click on the cash flow and income statement links.
The next steps are completed after you’re done extracting the breakdown labels and dollar amounts.
1. On the left sidebar, rename your template to “balance_sheet”. This will help us keep our templates organized.
2. Click on the PLUS(+) sign next to your “page” selection and choose the Select command.
3. Select the Cash flow link. Rename the selection to “cash_flow”.
4. Expand your “cash_flow” selection using the icon next to it and delete both “extract” commands under your “cash_flow” selection.
5. Click on the PLUS(+) sign next to your “cash_flow” selection and choose the Click command.
6. A pop-up will appear asking you if this is a “next page” link. Click on “No” and enter a name for this template. We will call it “cash_flow”. You will now be taken to the cash flow page.
7. A new Select command will automatically be created. Select the data you want to extract. You can use the “Scraping Cash Flows” section above to help guide you.
8. Once you’re done scraping everything on the cash flow statement, return to your balance sheet template. Click on the PLUS(+) sign next to your “page” selection under the “balance_sheet” template and choose the Select command.
9. Select the income statement link. Rename the selection to “income_statement”.
10. Expand your “income_statement” selection using the icon next to it and delete both “extract” commands under your “income_statement” selection.
11. Click on the PLUS(+) sign next to your “income_statement” selection and choose the Click command.
12. A pop-up will appear asking you if this is a “next page” link. Click on “No” and enter a name for this template. We will call it “income_statement”. You will now be taken to the income statement page.
13. A new Select command will automatically be created. Select the data you want to extract. You can use the “Scraping Income Statement” section above to help guide you.
Your final project should look like this:
Running Your web scraping project
Now it's time for the fun part, running your web scraping project! Simply click on the green “Get Data” button.
On this screen, you’ll be able to test, run or schedule your project.
For bigger projects, we recommend testing it to make sure it's scraping data properly.
In this case, we will just run it right away.
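Once the run finishes and you download the results, you may want to load them into a script. The sketch below assumes a JSON export in which the “Breakdown” selection becomes a list of objects, each with a `name` field plus one field per renamed column. That layout, the sample figures (made up, not Apple's), and the helper names `to_number` and `load_statement` are assumptions for illustration, so compare them with your own export first.

```python
import json

# Made-up sample in the assumed export layout; replace it with your own file contents.
SAMPLE_EXPORT = """
{
  "Breakdown": [
    {"name": "Total Revenue", "TTM": "1,000", "Sept_30_2020": "900"},
    {"name": "Cost of Revenue", "TTM": "600", "Sept_30_2020": "-"}
  ]
}
"""


def to_number(text):
    """Convert a scraped figure such as "1,000" to a float; blanks and "-" become None."""
    cleaned = text.replace(",", "").strip()
    if cleaned in ("", "-"):
        return None
    return float(cleaned)


def load_statement(raw_json, list_key="Breakdown"):
    """Return {breakdown label: {column name: value}} from the exported JSON text."""
    rows = json.loads(raw_json)[list_key]
    table = {}
    for row in rows:
        label = row["name"]
        table[label] = {
            column: to_number(value)
            for column, value in row.items()
            if column != "name"
        }
    return table


statement = load_statement(SAMPLE_EXPORT)
print(statement["Total Revenue"]["TTM"])
```

Treating "-" as a missing value keeps empty cells from breaking later calculations.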
Closing Thoughts
You can use this guide to extract financial statements for any business on the stock market. You can then use these spreadsheets to run calculations and inform your investment decisions.
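As one example of such a calculation, here is a short Python sketch that reads a CSV export and computes gross margin, (Total Revenue - Cost of Revenue) / Total Revenue. The column headers `Breakdown_name` and `Breakdown_TTM`, the sample figures (made up), and the `gross_margin` helper are assumptions, so rename them to match the header row of your own file.

```python
import csv
import io

# Made-up sample with assumed column headers; check the header row of your own export.
SAMPLE_CSV = """Breakdown_name,Breakdown_TTM
Total Revenue,"1,000"
Cost of Revenue,"600"
"""


def gross_margin(csv_text, column="Breakdown_TTM"):
    """Compute (Total Revenue - Cost of Revenue) / Total Revenue for one column."""
    values = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Strip thousands separators before converting to a number.
        values[row["Breakdown_name"]] = float(row[column].replace(",", ""))
    revenue = values["Total Revenue"]
    cost = values["Cost of Revenue"]
    return (revenue - cost) / revenue


print(f"Gross margin: {gross_margin(SAMPLE_CSV):.1%}")
```

To use a real export, read the file's text with `open(path).read()` and pass it in place of `SAMPLE_CSV`.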
We understand that projects can get quite complex. If you need any help with any of your projects, you can contact our customer support team using our live chat!
You may also be interested in learning How to Scrape Yahoo Finance Data: Stock Prices, Bids, Price Change and more.
How will you use the extracted financial statements?
Happy Scraping