Part I covers the use of Monte Carlo simulations to estimate how likely different cumulative case counts of COVID-19 (2019-nCoV) are over the next five days.
[From Wiki] Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle.
We ran 100 simulations using 20 days of data, from 23-Jan-2020 to 11-Feb-2020, under two different assumptions: one based on the empirical probability distribution of the observed data and another based on a normal distribution. The simulated outcomes are plotted so that we can see the range of possible outcomes. Counting the number of paths that end at a given outcome gives a sense of how likely that outcome is relative to the others. If the actual data (the observed cumulative count of COVID-19 cases) has a low simulated probability, the event is unlikely under the model's assumptions, and more attention should perhaps be placed on managing the spread of the virus because the count is outside the 'expected range'. As newer data arrives, we can rerun the simulations to take the new information into account.
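The two assumptions described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes each simulated path adds one randomly drawn daily increase per day for five days, and the daily counts, the function names `simulate_paths` and `share_within`, and the clipping of negative normal draws to zero are all assumptions made for this example.

```python
import random
import statistics

# Hypothetical daily new-case counts: placeholder values, NOT the observed data.
daily_new = [300, 500, 700, 800, 1800, 1500, 1700, 2000, 2100, 2600,
             2800, 3200, 3900, 3700, 3100, 3400, 2700, 3000, 2500, 2000]
last_cumulative = sum(daily_new)  # cumulative count on the last observed day


def simulate_paths(daily, start, horizon=5, n_sims=100, method="empirical", seed=0):
    """Return n_sims simulated cumulative-count paths, each `horizon` days long."""
    rng = random.Random(seed)
    mu = statistics.mean(daily)
    sigma = statistics.stdev(daily)
    paths = []
    for _ in range(n_sims):
        total = start
        path = []
        for _ in range(horizon):
            if method == "empirical":
                # Resample one of the observed daily increases with equal weight.
                step = rng.choice(daily)
            else:
                # Draw from a normal fit; clip at zero (an assumption made here)
                # so the cumulative count never decreases.
                step = max(0.0, rng.gauss(mu, sigma))
            total += step
            path.append(total)
        paths.append(path)
    return paths


def share_within(paths, low, high):
    """Fraction of paths whose final cumulative count lies in [low, high]."""
    finals = [p[-1] for p in paths]
    return sum(low <= f <= high for f in finals) / len(finals)


empirical_paths = simulate_paths(daily_new, last_cumulative, method="empirical")
normal_paths = simulate_paths(daily_new, last_cumulative, method="normal")
```

Calling `share_within(empirical_paths, low, high)` with a range of interest gives the fraction of simulated paths that end inside it, which is the "number of paths that ended with the outcome" idea expressed as a proportion.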
The raw data can be found here.
The Jupyter notebook can be found here.
We repeat the analysis using two more days of data (22 days in total). The results from 100 simulations and from 1000 simulations differ somewhat, probably because the daily counts have a high deviation/spread.
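One way to see why 100 and 1000 simulations can disagree is to repeat each run with different random seeds and measure how much a summary statistic moves between runs. The sketch below is an illustration only: the daily counts are placeholders rather than the real data, it uses the empirical-resampling assumption, and `five_day_totals` and `median_spread` are hypothetical helper names.

```python
import random
import statistics

# Hypothetical daily new-case counts: placeholder values, NOT the observed data.
daily_new = [300, 500, 700, 800, 1800, 1500, 1700, 2000, 2100, 2600,
             2800, 3200, 3900, 3700, 3100, 3400, 2700, 3000, 2500, 2000,
             2100, 1900]


def five_day_totals(daily, n_sims, horizon=5, seed=0):
    """Simulated total of new cases over `horizon` days, one value per path."""
    rng = random.Random(seed)
    return [sum(rng.choice(daily) for _ in range(horizon)) for _ in range(n_sims)]


def median_spread(daily, n_sims, repeats=30):
    """Standard deviation of the simulated median across repeated runs."""
    medians = [
        statistics.median(five_day_totals(daily, n_sims, seed=s))
        for s in range(repeats)
    ]
    return statistics.stdev(medians)


spread_100 = median_spread(daily_new, 100)
spread_1000 = median_spread(daily_new, 1000)
print(f"run-to-run spread of the median: 100 sims={spread_100:.0f}, "
      f"1000 sims={spread_1000:.0f}")
```

With widely spread daily counts, a run of 100 paths gives a noticeably noisier estimate than a run of 1000 paths, so two plots made with different simulation counts can look different even when the underlying assumptions are the same.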
[Edited on 18 Feb 2020] The actual observed data shows that the Monte Carlo simulations did capture it as one of the possible paths. With more days of data, the simulations become more stable (as can be seen by comparing the runs on 20 days of data with those on 22 days of data).