Reading CSV data from GitHub - Python

April 15, 2019

Today I decided to poke around to see whether it is possible to read CSV files directly from GitHub, and the answer is yes. As I have published numerous CSV datasets on GitHub, I thought it would be easier for people to access them without downloading the datasets or cloning the repository. As always (or as I'd hoped), there is an answer on the internet.

 

So, for example, to read in the dataset called 'arrivals2018.csv' hosted on the datadoubleconfirm repository, we need to get the link to the raw file and then run the code below. The screenshots show how to obtain the raw link: open the file's page on GitHub, click the "Raw" button, and copy the address of the page that opens.

 

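import pandas as pd

# Raw-file URL of the CSV on GitHub
url = 'https://raw.githubusercontent.com/hxchua/datadoubleconfirm/master/datasets/arrivals2018.csv'
# on_bad_lines='skip' drops malformed rows (pandas 1.3+); it replaces error_bad_lines=False, removed in pandas 2.0
df = pd.read_csv(url, on_bad_lines='skip')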
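If you only have the address of the file's normal GitHub page, the raw link can also be derived in code instead of clicking through. The sketch below is an illustration and not part of the original answer: the helper name `github_blob_to_raw` is hypothetical, the page URL is assumed from the raw link used above, and it relies on the usual `/<user>/<repo>/blob/<branch>/<path>` layout of GitHub file pages.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(blob_url: str) -> str:
    """Convert a github.com '/blob/' file URL to its raw.githubusercontent.com form.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(blob_url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {blob_url}")
    # Assumed path layout: /<user>/<repo>/blob/<branch>/<path to file>
    segments = parts.path.lstrip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub file (blob) URL: {blob_url}")
    # Drop the 'blob' segment; everything else carries over unchanged
    user, repo, _, *rest = segments
    raw_path = "/" + "/".join([user, repo, *rest])
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))


# Page URL assumed from the raw link used in this post
page_url = "https://github.com/hxchua/datadoubleconfirm/blob/master/datasets/arrivals2018.csv"
raw_url = github_blob_to_raw(page_url)
print(raw_url)
```

The resulting `raw_url` can be passed to `pd.read_csv` in the same way as the hand-copied link.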

 

List of datasets published on my GitHub

References: 

https://stackoverflow.com/questions/32400867/pandas-read-csv-from-url

©2017-2019 by DATA DOUBLE CONFIRM.