Thursday, June 6, 2013

Winning An Online Lottery In Just 6 Tries

A Case Study of Logical Error in Online Gambling

Gambling is one of the most profitable business models in the online world. There is no shortage of online betting houses, and many state lotteries are setting up their own online equivalents to compete with commercial alternatives (if they fail to regulate them off the market, that is, but that's not today's topic). Let's take a look at a typical online lottery system.

A registered user signs in to the lottery web site and buys some online credits/points with which he will be able to play. He then enters the "Lottery" game, a simple game of guessing which numbers will be drawn from a limited set by computer code on the server, hopefully using unpredictable random number generation. For the purpose of this example, let's suppose that 7 numbers are drawn from the set of 1 to 40 (e.g., numbers 3, 14, 15, 22, 28, 33 and 39). In the simplest case, the user selects exactly 7 numbers; if these numbers match the drawn numbers exactly, he gets the maximum payout, but he also gets progressively smaller rewards for matching, say, 6, 5 or 4 of the drawn numbers.
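
For illustration, here's a minimal Python sketch of how a server might draw such a set of winning numbers - assuming the operating system's cryptographically strong random number generator is acceptable for the purpose (the function name is just for this example):

import random

def draw_winning_numbers(pool_size=40, draw_count=7):
    # Use the OS CSPRNG so the draw is unpredictable
    rng = random.SystemRandom()
    # sample() returns draw_count distinct numbers from 1..pool_size
    return sorted(rng.sample(range(1, pool_size + 1), draw_count))

print(draw_winning_numbers())   # e.g., [3, 14, 15, 22, 28, 33, 39]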

For selecting 7 numbers, the user pays the price of a single bet (one combination of 7 numbers), but he is also allowed to select anywhere from 8 up to 14 numbers, effectively playing with 8 seven-number combinations (when selecting 8 numbers) up to 3,432 seven-number combinations (when selecting 14 numbers), and being charged accordingly.
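
The number of 7-number combinations contained in a selection of n numbers is simply "n choose 7"; here's a quick way to verify the figures above with Python's math.comb (available since Python 3.8):

from math import comb

# comb(n, 7) = number of distinct 7-number combinations in an n-number selection
for n in range(7, 15):
    print(n, comb(n, 7))
# prints 7 1, 8 8, 9 36, ..., 14 3432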

We've come across an online lottery application that worked as described, but had a terrible logical error that allowed a user to win the jackpot in roughly one out of six tries, each time betting just a single 7-number combination. With a large and growing number of gambling sites, it is possible that the same error exists elsewhere.

With significant simplification, let's look at the request sent by the user to the server, and at the server-side algorithm that processes it.


User's Request

When the user selects and confirms 7 numbers, his web browser sends the following JSON request (len specifies how many numbers there are and numbers is an array containing the selected numbers):

[len:7;numbers:{2,8,15,17,18,28,32}]

When selecting 9 numbers, the request looks like this:

[len:9;numbers:{1,13,14,19,28,29,30,37,38}]
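This notation is, of course, a simplification; in well-formed JSON such a bet would be sent as an object, roughly like the output of this small Python sketch (field names taken from the simplified examples above):

import json

numbers = [1, 13, 14, 19, 28, 29, 30, 37, 38]
print(json.dumps({"len": len(numbers), "numbers": numbers}))
# {"len": 9, "numbers": [1, 13, 14, 19, 28, 29, 30, 37, 38]}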


Server-Side Algorithm

The server-side pseudo-code processing the user's request - after the server has already drawn 7 unpredictable random numbers between 1 and 40 - looks like this:


// len must match the size of numbers array
If length(numbers) not equal len then
    Error("The length of the numbers array doesn't match len.")

// len must be between 7 and 14
If len < 7 or len > 14 then
    Error("The length of the numbers array is out of bounds.")

// User must have sufficient funds
If Not SufficientFunds(len) then
    Error("Insufficient funds for the bet.")

// Count the drawn numbers guessed by user
Count = 0

// For every number in user's set, check if it was drawn
For i = 1 to len {

    // number must be between 1 and 40
    If numbers[i] < 1 or numbers[i] > 40 then
        Error("Invalid number.")
    If numbers[i] in drawn_numbers then
        Count = Count + 1 // We found another match
}

// Log the number of correct guesses
Log("User correctly guessed " + Count + "numbers.")


For the purpose of this study, we don't care how the server then determines the prize. Let's just say that if Count equals 7, the user wins the jackpot.
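
To make the flaw easier to experiment with, here is a direct Python transcription of the pseudo-code above (the funds check is omitted, and count_matches is a name chosen for this example, not the application's actual code):

def count_matches(length, numbers, drawn_numbers):
    # len must match the size of the numbers array
    if len(numbers) != length:
        raise ValueError("The length of the numbers array doesn't match len.")
    # len must be between 7 and 14
    if length < 7 or length > 14:
        raise ValueError("The length of the numbers array is out of bounds.")
    count = 0
    for n in numbers:
        # each number must be between 1 and 40
        if n < 1 or n > 40:
            raise ValueError("Invalid number.")
        if n in drawn_numbers:
            count += 1  # we found another match
    return count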


What is Wrong With This Code?

At first glance, the code looks okay: it checks that the length of the numbers array matches the len value, that len is within the allowed bounds, that the user has sufficient funds on his account to make this bet, and that each user-supplied number is between 1 and 40; it then increases the Count value whenever one of the user-supplied numbers is found in the set of 7 previously drawn winning numbers.

So a malicious user can't play without paying first, can't bet with more than 14 numbers (although if charged accordingly, this would not be a loss for the lottery), and can't play with "silly" numbers (not that such numbers would give him any advantage either).

What the code above doesn't check for, however, and what makes a huge difference in the mathematical model of this lottery, is that the user is not required to provide unique numbers: his set of numbers can be, for instance, 1, 1, 2, 2, 3, 3, and 4. Or, let's be extreme: 1, 1, 1, 1, 1, 1, and 1. But how does this benefit the user?

The For loop iterates through every user-supplied number and increases Count if it finds that number among the winning numbers. Now, if the user plays with numbers 1, 1, 1, 1, 1, 1, and 1, Count will be 0 if number 1 is not among the winning numbers - but it will be 7 if number 1 is among the winning numbers, which happens with probability 7/40, i.e., 17.5%.
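
A quick Monte Carlo check of that figure, reusing the count_matches transcription from above together with a fair draw of 7 distinct numbers out of 40:

import random

rng = random.SystemRandom()
trials = 100000
jackpots = 0

for _ in range(trials):
    drawn = set(rng.sample(range(1, 41), 7))
    # the malicious bet: the same number seven times
    if count_matches(7, [1] * 7, drawn) == 7:
        jackpots += 1

print(jackpots / trials)   # ~0.175, i.e., 7/40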

This changes the odds of winning the jackpot from 1 in more than 18 million (one specific combination out of the 18,643,560 possible ones) to 1 in less than 6 (40/7, to be exact). Quite a bargain.
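
For the curious, the exact figures - a legitimate single bet has to hit one specific combination out of C(40, 7), while the duplicate trick merely needs its one number to be among the 7 drawn - can be computed like this:

from fractions import Fraction
from math import comb

print(comb(40, 7))             # 18643560 possible 7-number combinations
print(float(Fraction(40, 7)))  # the trick wins the jackpot once in ~5.71 tries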

If you're hosting an online lottery application, it would be a good idea to see whether your code does something similar.
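
One possible fix - a sketch, not the affected application's actual patch - is to reject any bet whose numbers aren't all distinct before counting the matches:

# Reject any bet that contains duplicate numbers
if len(set(numbers)) != len(numbers):
    raise ValueError("Duplicate numbers are not allowed in a bet.")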

P.S.: The described vulnerability is another case where migrating a business model from the physical world to the Internet enables a new type of attack that was previously not possible (in a traditional paper-based lottery, for instance, you cannot cross the same number seven times). It is therefore easy to understand how one can overlook such a possibility when writing online lottery code. See this blog post for a similar case.

@mkolsek
@acrossecurity