Scraping_a_Table.py

Lukman Omotosho

Data Scraper
Data Analyst
BeautifulSoup
Python

from bs4 import BeautifulSoup
import pandas as pd
import requests

# Download the Wikipedia page and parse its HTML.
url = "https://en.wikipedia.org/wiki/List_of_largest_companies_by_revenue"
page = requests.get(url)
soup = BeautifulSoup(page.text, "html.parser")

# Take the first table on the page and collect its rows.
table = soup.find_all("table")[0]
rows = table.find_all("tr")

# Use the first seven header cells as column names.
table_header = table.find_all("th")[0:7]
clean_table_header = [header.text.strip() for header in table_header]

df = pd.DataFrame(columns=clean_table_header)

# Append each row that has at least seven data cells, keeping the first seven.
for row in rows[1:]:
    columns = row.find_all("td")
    if len(columns) >= 7:
        row_data = [data.text.strip() for data in columns[:7]]
        length = len(df)
        df.loc[length] = row_data

# Save the scraped table and print a summary of the result.
df.to_csv("Top_companies.csv", index=False, encoding="utf-8")
df.info()
print(clean_table_header)
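
The same row-filtering logic can be tried offline. The following is a minimal sketch, not the author's implementation: `SAMPLE_HTML` and `table_to_dataframe` are hypothetical names introduced here, and the sample markup is invented to stand in for the downloaded page. It collects the rows in a plain list and builds the DataFrame once at the end, which is one possible alternative to appending with `df.loc` inside the loop.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical sample markup standing in for the downloaded page (assumption).
SAMPLE_HTML = """
<table>
  <tr><th>Rank</th><th>Name</th><th>Industry</th></tr>
  <tr><td>1</td><td>Acme</td><td>Retail</td></tr>
  <tr><td>2</td><td>Globex</td><td>Energy</td></tr>
  <tr><td colspan="3">footnote row</td></tr>
</table>
"""


def table_to_dataframe(html, n_cols):
    """Parse the first <table> in html into a DataFrame with n_cols columns."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        raise ValueError("no <table> element found")

    # The first n_cols header cells become the column names.
    headers = [th.get_text(strip=True) for th in table.find_all("th")[:n_cols]]

    # Keep only rows with at least n_cols data cells, as the script does.
    records = []
    for row in table.find_all("tr")[1:]:
        cells = row.find_all("td")
        if len(cells) >= n_cols:
            records.append([td.get_text(strip=True) for td in cells[:n_cols]])

    # Build the DataFrame in one step from the collected rows.
    return pd.DataFrame(records, columns=headers)


df = table_to_dataframe(SAMPLE_HTML, n_cols=3)
print(df)
```

The footnote row has a single `td`, so the length check skips it, mirroring how the script ignores rows with fewer than seven cells. All values stay as strings; converting columns such as revenue to numbers would be a separate step.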