Web scraping using BeautifulSoup on embedded HTML - Process - Python

November 2, 2019

In this web scraping attempt, I want to collect data on countries, sites, and site categories in one table. One challenge I faced was matching each site to the country it belongs to. The sites can be extracted by parsing the li elements, but to tie each site to its specific country, I need a way to loop through the countries listed. That's when I noticed that the sites for each country are also embedded under that country's div element.
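A minimal sketch of that idea, assuming a made-up page layout: the sample HTML, the "country" class on each div, the h4 country heading, and the category stored as the class of each li are all hypothetical stand-ins, not the structure of the page scraped in the notebook. It loops over the country divs and calls find_all again inside each one, so every site stays paired with its country.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page: one <div> per country,
# with that country's sites nested inside as <li> items. The class on each
# <li> is assumed to hold the site's category.
SAMPLE_HTML = """
<div class="country">
  <h4>Italy</h4>
  <ul>
    <li class="cultural">Historic Centre of Rome</li>
    <li class="cultural">Venice and its Lagoon</li>
  </ul>
</div>
<div class="country">
  <h4>Kenya</h4>
  <ul>
    <li class="natural">Lake Turkana National Parks</li>
  </ul>
</div>
"""


def scrape_sites(html):
    """Return one dict per site, keeping each site tied to its country."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # Outer loop: one iteration per country block.
    for div in soup.find_all("div", class_="country"):
        country = div.find("h4").get_text(strip=True)
        # Inner search is scoped to this div, so only this country's
        # sites are returned.
        for li in div.find_all("li"):
            rows.append({
                "country": country,
                "site": li.get_text(strip=True),
                # li["class"] is a list of class names; take the first.
                "category": li["class"][0],
            })
    return rows


rows = scrape_sites(SAMPLE_HTML)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed straight to pandas.DataFrame(rows) to get the single table of countries, sites, and categories.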

 

 

The pieces involved: div, .find_all, class, ['class'], .find_all

The entire notebook can be found below or here on GitHub.

If you would like more exercises, check out my previous 

©2017-2019 by DATA DOUBLE CONFIRM.