
DATA DOUBLE CONFIRM

Data wrangling exercises

  • Writer: datadoubleconfirm
  • 52 minutes ago
  • 1 min read

The best way to find out whether you enjoy working with data, or to learn how to work with it, is to handle many different messy, real-world datasets: that is, to perform the data cleaning and transformations needed to derive meaningful insights from them.


This post shares two sample Jupyter notebooks that demonstrate various data transformation steps to achieve clean data for two different data formats:

- JSON output obtained from calling the Developer API of the SingStat Table Builder (Singapore Department of Statistics): https://github.com/hxchua/datadoubleconfirm/blob/master/notebooks/SingStat%20API%202025.ipynb 



The example data relates to Graduates from University First Degree Courses by Type of Course and Sex in Singapore.
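As a rough illustration of the kind of parsing involved, here is a minimal sketch that flattens a nested JSON response into one record per series and year. It is not the notebook's code: the payload is a made-up stand-in, the field names (`Data`, `row`, `rowText`, `columns`, `key`, `value`) are assumptions about the response shape, and `to_number` and `flatten` are hypothetical helpers written for this example.

```python
import json

# Hypothetical payload: a trimmed-down stand-in for an API response.
# The field names are assumptions for illustration and the numbers are made up.
SAMPLE_JSON = """
{
  "Data": {
    "row": [
      {"rowText": "Males", "columns": [
        {"key": "2020", "value": "100"},
        {"key": "2021", "value": "110"}]},
      {"rowText": "Females", "columns": [
        {"key": "2020", "value": "120"},
        {"key": "2021", "value": "na"}]}
    ]
  }
}
"""


def to_number(text):
    """Convert a string to float; return None for non-numeric markers."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None


def flatten(payload):
    """Turn the nested series/columns structure into one record per cell."""
    records = []
    for series in payload["Data"]["row"]:
        for cell in series["columns"]:
            records.append({
                "series": series["rowText"].strip(),
                "year": int(cell["key"]),
                "value": to_number(cell["value"]),
            })
    return records


records = flatten(json.loads(SAMPLE_JSON))
for record in records:
    print(record)
```

A list of flat dictionaries like `records` can be passed straight to `pandas.DataFrame(records)` to get a long-format dataframe, which is usually the easiest shape to clean further or pivot before plotting.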


JSON output from calling the Developer API of SingStat Table Builder from Department of Statistics Singapore


Cleaned dataframe from parsing the JSON output
Cleaned transformed dataframe ready for plotting chart
Chart created using matplotlib. The code can be found in the first notebook link above.



©2017-2024 by DATA DOUBLE CONFIRM.
