
Toughest part in a data science project
I was surprised to see my post trending. Since it caught on, I am cross-sharing it here. While the post was meant to answer a...
Nov 10, 2019

Web scraping using BeautifulSoup on embedded HTML - Process - Python
In this web scraping attempt, I want to get data on countries, sites and categories of sites in one table. One challenge I faced is to get...
Nov 2, 2019

Challenges at the start of a data project
As with my other posts, I am using the title "data scientist" loosely because titles are not consistently used across the industry so to...
Aug 11, 2019

Emoji data
I chanced upon the emoji Python package, which allows printing emojis in Python, and decided to collect some data relating to emojis...
Jun 23, 2019

Sample size and correlation
A common question is how much data is enough. The answer is that it depends. First and foremost, we need to know what...
May 7, 2019

Reading CSV data from GitHub - Python
Today I decided to poke around a little to see if it would be possible to read CSV files directly from GitHub, and the answer is yes. As...
Apr 15, 2019

Text frequency analysis - Process - R
This is a tutorial to get the frequency distribution of words used in a chunk of text and is a simpler alternative to a more elaborate...
Dec 15, 2018

Web scraping using Selenium - Process - Python
In addition to BeautifulSoup, Selenium is a very useful package for web scraping when it involves repeated user interaction with the...
Oct 9, 2018

Crisis Risk Dashboard - Process - Tableau
This is Part I of a two-part post. Part I outlines the process of presenting the data using Tableau and Part II delves into insights from...
Sep 1, 2018

Singapore's Heartbeat
As Singapore's National Day is coming up, I decided to put together a data viz to better understand how much we have changed across the...
Jul 19, 2018