Recently a friend of mine has been obsessed with the lottery. He found a dataset of past winning combinations and is somehow convinced he can find the logic behind them.

While I don’t share his wild and bold imagination, I think it is a great chance to show just how impractical betting on the lottery is. So I obtained the 4D result dataset from Sports Toto Malaysia Sdn Bhd. It contains 25 years of data, from 1992 to 2017, and records every combination picked in each draw: the first prize, second prize, third prize, 10 special prizes and 10 consolation prizes.

As we can see in the dashboard, some combinations tend to appear more times than others. Is it by coincidence? Or is there actually some logic behind this?

Let's break it down one by one.


From the TreeMap chart, we see that the combination 4267 appears more often than most of the combination pool, 26 times in total. Strange. If we toggle the Frequency filter control below the TreeMap to include combinations with lower frequency, we see that most combinations have only a low to average frequency. Is 4267 really a favored number? Are we actually onto something?

Now looking at the Bar Chart, let’s toggle the Year filter control to reveal the highest frequencies in the most recent 5 years. It shows that since 2012, the combination 4267 has appeared 8 times. Notably, it appeared 6 times in 2014 alone. Rigged, you say? Perhaps. Now, we toggle the filter control again to find the most frequent combinations in 2017. What do we see? Or rather, what do we not see? Because now, 4267 is not in the bar chart anymore!

You don’t believe me? Let’s toggle the Top X Frequency filter control at the right. Moving the slider to the right gives us more bars in the chart. Let’s move it to the right, and see if 4267 is hiding in the back. Do you see it? Because I certainly don’t.

Maybe this is because it is only August 2017, and 4267 might still come back in the next several months. Well, if you think so, let’s move on to the Highlight Table.

This table shows the distribution of every combination in an organized tabular format. A darker color indicates a higher frequency. Note that this table is filtered by the same filter control as the TreeMap chart. Do you see any pattern? No? Perhaps you need to adjust the Frequency filter a bit to include combinations with lower frequency. Do they seem to follow some kind of pattern? Do the frequent combinations cluster together? Because to me, the distribution looks quite even. There is no tendency for certain combinations to appear in a specific cluster or group.
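
If you want to convince yourself that an even distribution can still produce a "hot" number, here is a minimal Python sketch that simulates a perfectly fair lottery. It is not based on the real dataset: the draw count `N_DRAWS` is a hypothetical round figure I picked for illustration, and `simulate_counts` is a made-up helper that assumes every prize slot is drawn independently and uniformly from 0000 to 9999.

```python
import random
from collections import Counter

def simulate_counts(n_draws, prizes_per_draw=23, total=10_000, seed=0):
    """Count how often each combination wins in a fair, simulated lottery.

    Assumption: every prize slot is an independent, uniform pick
    from the `total` possible combinations.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter()
    for _ in range(n_draws):
        for _ in range(prizes_per_draw):
            counts[rng.randrange(total)] += 1
    return counts

N_DRAWS = 4_000  # hypothetical number of draws, NOT taken from the dataset

counts = simulate_counts(N_DRAWS)
number, hits = counts.most_common(1)[0]
mean_hits = N_DRAWS * 23 / 10_000

print(f"average appearances per combination: {mean_hits:.1f}")
print(f"luckiest combination: {number:04d} with {hits} appearances")
```

Even though no number is favored here, the luckiest combination ends up well above the average. Change the seed and a different number takes the top spot, which is roughly what we would expect to see if 4267 were just lucky.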


To conclude, I also calculated the probability of winning a Sports Toto 4D lottery. There are 10,000 possible combinations in each draw, and only 23 winning combinations. If you bet on any random number in any draw, the probability of winning any of the prizes is only 0.23%, which is well under one percent!
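
For the curious, the arithmetic can be sketched in a few lines of Python. The simple estimate treats the 23 winning combinations as all different. The second figure is my own assumption about the mechanics: it treats each of the 23 prize slots as an independent, uniform pick, so the same combination could in principle come up twice in one draw. The function name `win_probability` is just for illustration.

```python
TOTAL_COMBINATIONS = 10_000  # 0000 through 9999
PRIZES_PER_DRAW = 23         # 3 top prizes + 10 special + 10 consolation

def win_probability(total=TOTAL_COMBINATIONS, prizes=PRIZES_PER_DRAW):
    """Chance that one fixed number matches at least one prize slot.

    Assumption: each slot is drawn independently and uniformly,
    so repeats within a draw are possible.
    """
    return 1 - (1 - 1 / total) ** prizes

simple = PRIZES_PER_DRAW / TOTAL_COMBINATIONS  # 23 distinct winners
with_repeats = win_probability()

print(f"simple estimate:  {simple:.4%}")
print(f"allowing repeats: {with_repeats:.4%}")
```

Either way the answer rounds to 0.23%, so the conclusion does not depend on which assumption you prefer.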

So, why do you even try?

Disclaimer: This project is strictly educational. The results obtained and examples used in this project should not be held accountable for your own behavior. We do not encourage gambling and therefore will not be responsible for your own actions.