DATA DIVE DAYS

Process and product of various data science tasks, from data collection, preparation, and visualization to basic statistical analysis and modelling. Datasets for practice available.

Selected as Top 100 Data Science Resources for 2018

on MastersInDataScience.com

November 10, 2019

I was surprised to see my post trending. Since it caught on, I am cross-sharing it here.

While the post was meant to answer a question I often get about whether data scientists will be automated away, it was really intended more as an outlet for me, as there was...

October 30, 2019

While I mainly host my datasets on my GitHub repository, I have also cross-shared some datasets on data.world, as the platform is integrated with quite a few other tools. Also, data.world is more user-friendly for users who might not want to dabble in GitHub...

June 23, 2019

I chanced upon the emoji Python package, which allows printing emojis in Python, and decided to collect some data relating to the emojis listed on the emoji cheat sheet. There were associated terms/descriptions (i.e. alternative names) for each emoji and they were scra...

April 24, 2019

Various data items, such as channel name, video title, and number of views, likes, dislikes, and comments, can be retrieved using the YouTube Data API v3. This is a free service; however, there are limits on the number of requests we can make. Sho...
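As a rough illustration of the kind of request involved, here is a minimal sketch that builds a call to the API's `videos` endpoint and flattens one returned item into the fields listed above. The helper names `build_videos_url` and `summarize_item`, the placeholder video IDs, and the API key are assumptions for illustration only, not code from the post. Counts that the API does not return (dislike counts, for instance, may be absent) are left as `None`.

```python
from urllib.parse import urlencode

# Public endpoint of the YouTube Data API v3 for video resources.
VIDEOS_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(video_ids, api_key):
    """Build a request URL asking for snippet and statistics of the given videos.

    Hypothetical helper: several IDs are sent in one request (comma-separated),
    which helps when the number of requests is limited.
    """
    params = {
        "part": "snippet,statistics",
        "id": ",".join(video_ids),
        "key": api_key,
    }
    return VIDEOS_ENDPOINT + "?" + urlencode(params)


def _to_int(stats, key):
    """Counts arrive as strings; return None when a count is missing."""
    return int(stats[key]) if key in stats else None


def summarize_item(item):
    """Flatten one entry of the response's 'items' list into a simple row."""
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})
    return {
        "channel": snippet.get("channelTitle"),
        "title": snippet.get("title"),
        "views": _to_int(stats, "viewCount"),
        "likes": _to_int(stats, "likeCount"),
        "dislikes": _to_int(stats, "dislikeCount"),
        "comments": _to_int(stats, "commentCount"),
    }
```

The URL can then be fetched with any HTTP client, and each element of the JSON response's `items` list passed to `summarize_item`.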

April 15, 2019

Today I decided to poke around a little to see if it would be possible to read CSV files directly from GitHub, and the answer is yes. As I have published numerous CSV datasets on GitHub, I thought it would be easier for people to access them without downloading the dat...
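A minimal sketch of the idea, using only the standard library: a file's GitHub page URL is rewritten to its `raw.githubusercontent.com` form, which serves the plain file contents, and the result is parsed as CSV. The function names are hypothetical and this is not necessarily the approach taken in the post. The URL rewrite assumes a branch name without slashes.

```python
import csv
import io
from urllib.parse import urlparse
from urllib.request import urlopen


def to_raw_url(blob_url: str) -> str:
    """Convert a github.com 'blob' page URL into its raw-content URL."""
    parts = urlparse(blob_url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: <user>/<repo>/blob/<branch>/<path to file>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    user, repo, _, branch, *file_path = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, branch, *file_path])


def parse_csv_text(text: str) -> list:
    """Parse CSV text into a list of row dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_github_csv(blob_url: str) -> list:
    """Download a CSV file from GitHub and parse it (requires network access)."""
    with urlopen(to_raw_url(blob_url)) as response:
        return parse_csv_text(response.read().decode("utf-8"))
```

Libraries such as pandas can also read from a raw URL directly, so the conversion step is the main thing to get right.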

October 9, 2018

In addition to BeautifulSoup, selenium is a very useful package for web scraping when it involves repeated user interaction with the website (e.g. clicking to select options from a dropdown list and submitting) to generate a desired output/result of interest. Selenium...

July 19, 2018

As Singapore's National Day is coming up, I decided to put together a data viz to better understand how much we have changed across the years. National Day is the time to celebrate the birth of our nation, but also a time to look back at our growth (in construction, e...

July 7, 2018

Having data in tables in a PDF is probably one of the most agonizing things for users. It feels as if the data is there but not there. In this post, I cover two resources that allow us to extract data into Excel/CSV format. I made use of the U.S. Complement to the End of...

July 3, 2018

It's always exciting when the data visualization or analysis you did is used to push forward a movement or a cause. Most would agree this is probably one of the greatest satisfactions we derive as data scientists (again, I'm using this title loosely). So, I got to...

May 11, 2018

This is Part I of a two-part post. Part I talks about scraping data from SGDI while Part II outlines the process of presenting the data using Tableau.  

The code builds on the one covered in a previous post on how to use BeautifulSoup in Py...

