Bayesian Learning Project - Cheaters Analysis


Cheaters Analysis using Bayesian Techniques
By Uriel Chareca

Abstract
Ubisoft (in particular Massive Studios), one of the biggest video game development companies, provided a vast data set of game analytics. Using diverse Bayesian techniques, different models were implemented to describe the fight outcomes of this role-playing game. The goal is to find the model that best fits user behavior and could help identify the users that are cheating.

1- Dataset and Problem definition
As part of Ubisoft's interview process, they developed a skill assessment exercise that consisted in analyzing a vast fake log file, with barely any information on how the game functions, to see if evidence of cheating could be found. I only knew that the fake game was named Creepy Crawlspace Crawlers and, as the name suggests, it is a re-imagination of the classical dungeon crawlers from the beginning of computer games. According to online information about this old genre, users are supposed to wander around level maps running missions that allow them to fight different monsters, gaining skills such as experience, wisdom and strength, as well as gold that can be used to buy and sell weapons. Fights are ruled by a combination of die throws and comparisons between user and monster qualities, but no official information was shared on how fights, or any other part of the game, really work.

The database provided includes a log of 56189 records, for 100 users, from 5 main event categories:
- Completed missions
- Level Up
- Fights
- Buy transactions
- Sell transactions

Originally the log was coded as JavaScript Object Notation (JSON), but I manually converted it to .csv format to make the work easier in the usual statistical scripting languages. Each category includes a diverse number of variables to analyze; therefore a specific analysis by event was made, with different scopes and aims for each situation.
Variables include information about the status at each timestamp of the Mission in place; about the User (its Position, Level, Experience, Wisdom, Gold and Strength); about the carried weapons (only 2 weapons possible at a time, with their information such as Level, Strength, Wisdom and Price); and about the fighting Monster (its Type, Level, Strength, Wisdom and Experience). From this analysis many strategies were found to be significant signs of cheating, and individual cheating users were identified by a review of outlier patterns and clustering; for instance, gold gathering via reselling specific objects thousands of times, and re-running specific levels as a way to level up user skills.


One idea conceived at the time, and the core of this extended paper, was to research the way the fights were run, and thereby try to find users that are consistently winning when their expected behavior was to lose. Focusing only on the subset of the data that belongs to the fight records (approximately 20,000 log entries), the initial exploratory analysis showed no clear separation of Win/Loss fights in any combination of variables (see figure 1). As the data was so mixed together, there was no clear identification of variables or patterns that might define a Win or Loss scenario, and thus no strange behavior indicating a cheat could be found this way. It therefore seemed that a supervised classification algorithm such as a decision tree could work best. Nevertheless, its misclassification rate remains significant, with an average of 40% of observations misclassified, both as unexpected Wins and unexpected Losses (see figure 2).

Figure 1: WIN fights in light blue, LOSS fights in red

Figure 2: Example of decision tree for Monster “Evil Lizard”


2- Introduction to a likelihood-based model separation

As all criteria based on the available variables remain non-significant, it is possible instead to model the probability of winning as a random variable given the available data. A Bernoulli distribution, based on a single parameter p that models the probability of success/failure, seems the best fit. The general success ratio was approximately 50%, but it includes a clear glitch in the system for the monster "Smug Hobbit" (0% chance to win), which needs to be ruled out of the analysis.

Figure 3: Success Ratios by Monster

Fitting this parameter p ("θ") to the observed ratios, we can calculate, for every user, the probability of observing their behavior given our model. When plotting this likelihood we can clearly identify the outlier users having a better performance than expected. Assuming independence, the expected performance ("likelihood") is calculated as the product of the observed performance probabilities by monster. For example: Prob(User1 wins 7/10 times vs Monster1) * Prob(User1 wins 4/8 vs Monster2) * … In this case the likelihood function is used as a deterministic function of our parameters θj (each monster's defined probability of winning), and Xj is the number of successes and failures against monster j for user i.

Likelihood I:

p(X1, X2, …, X7 | θ) = p(X1|θ1) ⋯ p(X7|θ7) = θ1^S1 (1−θ1)^F1 ⋯ θ7^S7 (1−θ7)^F7
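As a concrete illustration, this per-user log-likelihood can be sketched in a few lines of Python. The θ values and the fight record below are hypothetical placeholders, not the fitted ones:

```python
import math

def log_likelihood(fights, theta):
    """Sum over monsters of S_j*log(theta_j) + F_j*log(1 - theta_j)."""
    ll = 0.0
    for monster, (successes, failures) in fights.items():
        t = theta[monster]
        ll += successes * math.log(t) + failures * math.log(1.0 - t)
    return ll

# Hypothetical fitted win probabilities per monster.
theta = {"evil_wizard": 0.59, "giant_squid": 0.59, "grumpy_dragon": 0.56}
# Hypothetical record for one user: monster -> (successes, failures).
user_fights = {"evil_wizard": (7, 3), "giant_squid": (4, 4), "grumpy_dragon": (5, 5)}

print(log_likelihood(user_fights, theta))
```

Users whose records have an unusually low log-likelihood under the fitted θ are the candidates for "strange behavior".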

3- Bernoulli Model

Under a Bayesian approach, our first model for the response variables Xi = 1 (if the ith fight is a win) is a function of parameters θi, independent and identically distributed as Bernoulli. The posterior distribution comes from multiplying a prior by the Bernoulli likelihood. Using a conjugate prior leads to a posterior in the same distribution family. In this case we assume a Beta prior with parameters alpha = 1 and beta = 1, an accepted uninformative prior for a probability between 0 and 1. It can be proven that the posterior distribution of each parameter θi is a Beta with parameters alpha_new = prior alpha + number of successes and beta_new = prior beta + number of failures (see figure 4).

Figure 4: Posterior Beta, Prior Beta

Table 1: Posterior distribution of θi

            evil_wizard  giant_squid  grumpy_dragon  nasty_dog  reanimated_skeleton  rotten_zombie  smelly_gnome
failure         995          1119          1212         1426           1271               1267           1241
success        1442          1589          1559         1678           1668               1652           1647
post beta       996          1120          1213         1427           1272               1268           1242
post alpha     1443          1590          1560         1679           1669               1653           1648
post mean      0.592         0.587         0.563        0.541          0.567              0.566          0.570
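The conjugate update behind Table 1 can be reproduced directly from the success/failure counts; a minimal Python sketch, using the counts reported above:

```python
# Success/failure totals per monster, as reported in Table 1.
counts = {
    "evil_wizard": (1442, 995),
    "giant_squid": (1589, 1119),
    "grumpy_dragon": (1559, 1212),
    "nasty_dog": (1678, 1426),
    "reanimated_skeleton": (1668, 1271),
    "rotten_zombie": (1652, 1267),
    "smelly_gnome": (1647, 1241),
}

prior_alpha, prior_beta = 1, 1  # Beta(1, 1) uninformative prior
for monster, (successes, failures) in counts.items():
    post_alpha = prior_alpha + successes  # alpha_new = alpha + successes
    post_beta = prior_beta + failures     # beta_new  = beta + failures
    post_mean = post_alpha / (post_alpha + post_beta)
    print(f"{monster}: Beta({post_alpha}, {post_beta}), mean = {post_mean:.3f}")
```

The printed posterior means match the last row of Table 1.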

From this original dataset, 2 outliers were found, even though the company actually stated afterwards that no fix had been set in the fight processes. In figure 5 below, the histogram of the log-likelihood by user helps identify 2 users with a separated and smaller likelihood, and therefore assumed "strange behavior". User 34 has on average a 22% better-than-expected performance, while User 71 is actually suffering a 16% worse performance than expected. In fact, user 34 was achieving this significant performance because he was consistently gathering gold and skills via a cheating strategy, and therefore also increasing his winning ratios.

Figure 5: Histogram of Loglikelihood

Table 2: Detail of Outliers (original data)

          evil_wizard  giant_squid  grumpy_dragon  nasty_dog  reanimated_skeleton  rotten_zombie  smelly_gnome  smug_hobbit
GRAL          59%          59%           56%          54%            57%                57%            57%           0%
USER 34       67%          61%           80%          55%            62%                83%            76%           0%
USER 71       38%          72%           46%          41%            51%                35%            52%           0%

To enhance the analysis, a new dataset was created, replicating the distributions of the observed variables. As each variable only accepts positive values over a wide range, and is usually skewed to the right with a higher concentration of lower values, a gamma distribution seems the best fit. By the method of moments, each variable was fit to a gamma with parameters alpha ("shape") = mean^2/variance and beta ("rate") = mean/variance (see figure 6).

Figure 6: Fit of variable "Experience" to a Gamma (shape=0.62, rate=0.000011); in black observed values, in red fitted distribution
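The method-of-moments fit can be sketched as follows; the synthetic sample is only an illustration (using the shape and rate quoted for "Experience"), not the paper's data:

```python
import random

def fit_gamma_mom(data):
    """Method-of-moments gamma fit: shape = mean^2/var, rate = mean/var."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)  # sample variance
    return mean ** 2 / var, mean / var

# Illustrative draws; gammavariate takes (shape, scale) with scale = 1/rate.
random.seed(1)
sample = [random.gammavariate(0.62, 1 / 0.000011) for _ in range(5000)]
shape, rate = fit_gamma_mom(sample)
print(f"shape = {shape:.3f}, rate = {rate:.8f}")
```

The recovered shape and rate should sit near the generating values of 0.62 and 0.000011, up to sampling noise.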


For future validation, 100 users were created with a random sample of 100 fights each. The fight outcomes of the first 90 users came from a Bernoulli distribution with probability of success p = 57% (the general average observed, excluding the Smug Hobbit monster), while the remaining 10 users came from a similar distribution but with 25% higher probability of winning (p = 57% * 1.25 = 71.25%). Running a number of draws from the posterior distribution of the parameters θi and the response vector, and then averaging the likelihoods, creates a good graphical view of its distribution. As can be seen in figure 7, the cheaters are clearly separated from the other users, and the boxplot clearly suggests an acceptable outlier ("cheater") limit, set at 1.5 times the interquartile range (figure 8). All fixed cheater users were correctly identified, plus 2 "normal" users.

Figure 7: New Log likelihood histogram
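A simplified sketch of this validation setup: the paper averages likelihoods over posterior draws, whereas this illustration scores each user directly under the fitted p = 0.57 with a binomial log-likelihood (so both over- and under-performers score low), then applies the 1.5 * IQR rule:

```python
import math
import random

random.seed(0)
P_NORMAL, P_CHEAT, N_FIGHTS = 0.57, 0.57 * 1.25, 100

# 90 "normal" users plus 10 fixed cheaters with 25% higher win probability.
win_probs = [P_NORMAL] * 90 + [P_CHEAT] * 10
wins = [sum(random.random() < p for _ in range(N_FIGHTS)) for p in win_probs]

def loglik(w, n=N_FIGHTS, p=P_NORMAL):
    """Binomial log-likelihood of w wins in n fights under the fitted p."""
    return (math.log(math.comb(n, w))
            + w * math.log(p) + (n - w) * math.log(1 - p))

lls = sorted(loglik(w) for w in wins)
q1, q3 = lls[24], lls[74]              # rough quartiles of the 100 values
lower_fence = q1 - 1.5 * (q3 - q1)     # 1.5 * IQR outlier rule (lower tail)
flagged = [i for i, w in enumerate(wins) if loglik(w) < lower_fence]
print("flagged user indices:", flagged)
```

With the cheaters placed at indices 90-99, the flagged set should pick up most of them, possibly along with a few unlucky normal users, mirroring the result above.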

Figure 8: Box Plot of New data


Table 3: Outlier Users

Table 4: Winning probabilities by User

           evil_wizard  giant_squid  grumpy_dragon  nasty_dog  reanimated_skeleton  rotten_zombie  smelly_gnome
Expected      0.59         0.59          0.56          0.54           0.57               0.57           0.57
User 57       0.73         0.71          0.56          0.62           0.80               0.42           0.71
User 81       0.62         0.54          0.50          0.83           0.86               0.53           0.36
User 91       0.75         0.75          0.78          0.67           0.83               0.69           0.71
User 92       0.70         0.79          0.75          0.56           0.64               0.57           0.70
User 93       0.61         0.80          0.85          0.69           0.67               0.55           0.60
User 94       0.85         0.81          0.82          0.73           0.67               0.63           0.45
User 95       0.86         0.63          0.55          0.60           0.71               0.67           0.65
User 96       0.72         0.67          0.76          0.85           0.59               0.64           0.75
User 97       0.47         0.87          0.79          0.78           0.71               0.71           0.38
User 98       0.75         0.85          0.77          0.63           0.60               0.73           0.75
User 99       0.86         0.60          0.67          0.60           0.42               0.67           0.80
User 100      0.61         0.25          0.73          0.63           0.80               0.76           0.50


4- Probit Regression Model

A second model introduced here estimates the 1/0 response (Win/Loss) via a regression model, ignoring which monster is being fought. An initial test using a logistic regression on the original data gives the following results, with no great fit and many variables not significant under the usual significance criteria. A reduced model still shows a poor fit with barely any AIC reduction. Therefore I decided to keep all variables for further analysis; even if there might be clear correlation between them, I found it interesting to identify the proper coefficient for each variable.

Table 5: example of logistic regression on original data

The second Bayesian model included was a Probit regression. This model estimates the probability of a win: P(Yi = 1 | Xi) = Φ(xi'β), where Φ is the cumulative distribution function of the standard normal distribution. We set up a latent variable ui distributed as Normal(xi'β, 1), and Yi is 1 (Win) if ui > 0 and 0 otherwise. In this case I aim to obtain a considerable set of draws of the β coefficients (the θi in this case), so that the converged values from each θi distribution can be plugged into the respective likelihood function. These draws can be generated via Gibbs sampling. Gibbs creates a sequence of estimates of the ui and βi by iterating between the full conditional posteriors: p(β | u, y), a multivariate normal as in a linear regression, and p(ui | β, y), which is a Normal(xi'β, 1) truncated above or below 0 depending on whether yi = 0 or 1. The posterior p(ui | β, y) is proportional to the product of p(y | β, u) and p(u | β). Starting from initial values of ui equal to 0.5 for a win and -0.5 for a loss, and the betas equal to 0, below are the distributions of the different parameters and the confirmation of their convergence.

Figure 9: example of distribution and convergence plots
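A minimal single-covariate version of this Gibbs scheme (the Albert-Chib data-augmentation sampler with a flat prior on β) can be sketched in pure Python; the synthetic data, true coefficient and iteration counts are illustrative, not the paper's:

```python
import random
from statistics import NormalDist

N = NormalDist()  # standard normal: cdf and inv_cdf
random.seed(2)

# Synthetic probit data with one covariate and true beta = 1.5; in the
# paper x would be the fight covariates and y the Win/Loss flag.
true_beta = 1.5
x = [random.gauss(0, 1) for _ in range(200)]
y = [1 if random.gauss(xi * true_beta, 1) > 0 else 0 for xi in x]

def trunc_normal(mean, positive):
    """Draw from N(mean, 1) truncated to u > 0 (win) or u <= 0 (loss)."""
    lo = N.cdf(-mean)                      # P(N(mean, 1) <= 0)
    u = random.random()
    p = lo + u * (1 - lo) if positive else u * lo
    p = min(max(p, 1e-12), 1 - 1e-12)      # guard inv_cdf against 0 and 1
    return mean + N.inv_cdf(p)

sxx = sum(xi * xi for xi in x)
beta, draws = 0.0, []
for it in range(2000):
    # Step 1: u_i | beta, y -- truncated normal around x_i * beta.
    u = [trunc_normal(xi * beta, yi == 1) for xi, yi in zip(x, y)]
    # Step 2: beta | u, y -- linear-regression posterior with variance 1.
    mean_b = sum(xi * ui for xi, ui in zip(x, u)) / sxx
    beta = random.gauss(mean_b, (1 / sxx) ** 0.5)
    if it >= 500:                          # discard burn-in
        draws.append(beta)

print("posterior mean of beta:", sum(draws) / len(draws))
```

The posterior mean of the draws should land near the generating coefficient of 1.5; the paper's multivariate case replaces the scalar regression step with a multivariate normal draw.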


Table 6: Converged Beta values

Table 6 lists the converged values of each beta coefficient related to the individual variables. They do not seem consistent with the logistic regression, but before we had an intercept in consideration and this is now a different model. Nevertheless, it is worth noting that, considering as statistically significant a variable with a ratio mean/sd > 2 (given a normal distribution and 5% significance), almost all variables are now significant in this model (with the exception of MonsterStrength and MonsterWisdom). From this distribution of beta coefficients it is possible to run many draws, along with the expected outcomes of the distribution specified above, and create a new likelihood table in order to search for outliers. The likelihood of the new data given this model can be written as:


p(y | β) = Π_i π(xi)^yi (1 − π(xi))^(1−yi)

where π(xi) = Φ(xi'β) is the mentioned normal CDF. Therefore the log-likelihood per user can be calculated via the following formula:

log L(user) = Σ_i [ yi log π(xi) + (1 − yi) log(1 − π(xi)) ]
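That per-user score can be sketched as follows; the covariates and coefficients below are hypothetical, with the standard-library normal CDF playing the role of π:

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf  # pi(x_i) in the text: standard normal CDF

def probit_loglik(rows, beta):
    """Sum over a user's fights of y*log(Phi(x'b)) + (1-y)*log(1-Phi(x'b)).
    rows: list of (covariate_vector, y) pairs; beta: coefficient vector."""
    ll = 0.0
    for xv, yi in rows:
        p = Phi(sum(b * v for b, v in zip(beta, xv)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # numerical guard near 0 and 1
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

# Hypothetical example: one user, two fights, two covariates each.
beta = [0.8, -0.3]
fights = [([1.0, 2.0], 1), ([0.5, 1.0], 0)]
print(probit_loglik(fights, beta))
```

Scoring every user this way, with beta set to the converged Gibbs values, yields the likelihood table used for the outlier search.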

Figure 10: Log Likelihood of Probit Model for new data

Figure 11: Boxplot of Probit Model for new data

Table 7: Outliers found on Probit Model


Even after 1000 iterations and averaging the log-likelihood values, the last figures show that the users are closer to each other, and it is now harder to identify outliers and cheaters than before. For instance, only 1 of the 10 known cheaters was found (user 93), and 4 random users were flagged as outliers as well. When we review each user's winning probabilities, the selected outliers do not present a clearly higher probability compared to the expected values, as before (compare table 8 vs table 4).

Table 8: Probability of Winning of Outliers

          evil_wizard  giant_squid  grumpy_dragon  nasty_dog  reanimated_skeleton  rotten_zombie  smelly_gnome
Expected     0.59         0.59          0.56          0.54           0.57               0.57           0.57
User 19      0.67         0.73          0.50          0.55           0.67               0.40           0.44
User 28      0.41         0.61          0.56          1.00           0.64               0.62           0.63
User 41      0.64         0.47          0.46          0.60           0.21               0.69           0.69
User 85      0.50         0.74          0.77          0.75           0.56               0.38           0.71
User 93      0.46         0.75          0.81          0.80           0.63               0.73           0.85

5- Conclusion

Two Bayesian models have been presented, but only one clearly identifies all the cheaters, as it is easier to set a threshold on the log-likelihood distribution of the data given the parameters. The first, simpler (single-parameter) model of a Bernoulli distribution of the probability of winning by monster provides an easier-to-understand model along with better results. In the Probit regression model, the data is grouped together, which leads to many mistakes in the users selected as outliers, and no clear linkage between the probability of winning and the outlier selection. Nevertheless, this is just an introduction to how descriptive statistics using Bayesian inference could be used for game analytics, specifically for identifying cheaters or any other desired labels. The opportunities for further research on the topic are immense, and it is a new and exciting field for statisticians to work on.

