User blog:Guy Bukzi Montag/Drop Rates Explained: 5th Anniversary (II)

This is part II of a series on the maths behind drop rates.


 * Part I: Practical questions
 * Part II: Extracting probabilities
 * Part III: More than one box (tba)

This part contains the explanations promised in part I and will show you how to extract probabilities (drop rates) from the percentages given in a box info. The boxes granted by the 5th Anniversary event will serve as examples, but the methods also work for other boxes.

Definitions
Every box has two properties:
 * the description ("Contains 4 possible rewards", "Grants 15 Engine cards! At least 1 will be an i6 card!") and
 * the box info (A8BoxInfo.png, or A8BoxInfoAnniversaryBundle.png for 5th Anniversary Bundles) which shows a list of percentages for the items in the box.

Drop rate (probability)
Sometimes the description of a box already contains information about probabilities: If an item is marked as "guaranteed" or described as "At least 1 will be [...]", its drop rate or probability is 100 %.


 * The drop rate of an item is the probability of getting it in a box.

Percentage (expected value)
Contrary to the belief of many players, the percentages shown in the box info do not represent the probabilities of the items. Instead, they show the distribution of items that can be expected if a very large number of boxes is opened. For example, a guaranteed blueprint in a box with 4 items has a probability of 100 % (because it's guaranteed), but its expected value in relation to the other items in the box is 25 % because it will always be 1 of the 4 items, thus $$\tfrac{1}{4}$$ = 25 %. This 25 % value is the one listed in the box info, not the probability.


 * The percentage or expected value of an item is its amount relative to the other items inside a box, expressed in percent.

Random-only boxes
Sometimes the expected value of an item allows us to deduce its probability directly. For example, if there are 4 random items in a box and 1 has an expected value of 20 % shown in the box info, we can expect to get about 20 of this item if we open 100 boxes. This number can vary because the item is still random and 100 is not a reliable sample size. But if we open 1,000 or even 100,000 boxes, the share of the chosen item will get closer and closer to 20 %.


 * The percentage (expected value) of an item equals its probability if there are only random items in a box.
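The convergence described above can be tried out with a small simulation. This is only a sketch: the function name `simulate_share`, the fixed seed and the assumption that every slot of a box is filled independently are mine, not something the game documents.

```python
import random

def simulate_share(probability, items_per_box, boxes, seed=0):
    """Open `boxes` boxes and return the observed share of one item.

    Assumption: every slot in a box is filled independently, and the item
    appears in a slot with the given probability.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_items = boxes * items_per_box
    hits = sum(rng.random() < probability for _ in range(total_items))
    return hits / total_items

# The more boxes we open, the closer the observed share should get to 20 %.
for boxes in (100, 1_000, 100_000):
    share = simulate_share(0.20, items_per_box=4, boxes=boxes)
    print(f"{boxes:>7} boxes: {share:.2%}")
```

With only 100 boxes the printed share can be visibly off; with 100,000 boxes it should sit very close to 20 %.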

Example: The Specialist Bundle of the 5th Anniversary event "contains 3 possible rewards". None of them is marked as guaranteed.

Info:
 * Blueprint Boxes: 10 %
 * Tokens: 3.33 %
 * Credits: 23.33 %
 * Pro Kit Boxes: 63.33 %

So if we open 100 boxes, we are likely to get more or less 10 blueprints.

Unfortunately, the box info doesn't say anything about the expected values of items subsumed under "Pro Kit Boxes". Apart from blueprints, tokens and credits, there are 9 other items on the list (engines, tech and class A parts). As we don't know if they have additional specific probabilities assigned to them (for example if a mid-tech card is more likely to appear than a hybrid engine), we have to assume that they all have the same probability. This means that each item counting as a Pro Kit Box has a probability of $$63.33 \% \cdot \tfrac{1}{9} = 7.04 \%$$.
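As a quick check of that division, here is a minimal sketch. The helper name `per_item_probability` is made up for this post, and the even split is exactly the assumption stated above, not a known game rule.

```python
def per_item_probability(group_probability, item_count):
    """Split a group's probability evenly across its items (assumed uniform)."""
    if item_count <= 0:
        raise ValueError("item_count must be positive")
    return group_probability / item_count

# Specialist Bundle: 63.33 % for Pro Kit Boxes, spread over 9 items.
specialist = per_item_probability(63.33, 9)
print(f"{specialist:.2f} %")  # 7.04 %
```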

That's all there is to say about random-only boxes. The more boxes you open, the closer the share of each item is likely to get to its expected value given in the info.

Boxes with guaranteed items
As seen above, guaranteed items distort the percentages of a box. As we already know that their probability is 100 %, we can leave them out and concentrate on what the percentages would have been without them.

Example: The Novice Bundle "contains 2 possible rewards". One of the items in the list (a BMW M2 SE blueprint) is marked as guaranteed.

Info:
 * Blueprint Boxes: 87.5 %
 * Pro Kit Boxes: 12.5 %

In this case we cannot simply say that the probability of getting a Pro Kit Box is 12.5 %. It's true that we can expect roughly 125 Pro Kit Boxes if we open 1,000 Novice Bundles, but this is only the expected value, not the probability.

The reason is: Every box contains 2 items, and 1 out of 2 is guaranteed. So 50 % of the items are guaranteed blueprints, and the other 50 % are divided into 37.5 % random blueprints and 12.5 % Pro Kit Boxes.


 * $$\tfrac{37.5}{50}$$ is the same as $$\tfrac{75}{100}$$, so the real probability of additional random blueprints is 75 %.
 * $$\tfrac{12.5}{50}$$ is the same as $$\tfrac{25}{100}$$, so the real probability of Pro Kit Boxes is 25 %.

If there aren't any additional specific probabilities assigned to the 7 items counting as a Pro Kit Box, each of them has a probability of $$25 \% \cdot \tfrac{1}{7} = 3.57 \%$$.
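To see that a 25 % probability really shows up as 12.5 % in the box info, the Novice Bundle can be simulated. The model below is an assumption on my part (one guaranteed blueprint slot plus one independent random slot), and `novice_pro_kit_share` is just an illustrative name.

```python
import random

def novice_pro_kit_share(boxes, pro_kit_probability=0.25, seed=0):
    """Return the share of Pro Kit Boxes among all items from `boxes` bundles.

    Assumed model: slot 1 is always the guaranteed blueprint, slot 2 is a
    Pro Kit Box with `pro_kit_probability` and a random blueprint otherwise.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pro_kits = sum(rng.random() < pro_kit_probability for _ in range(boxes))
    total_items = 2 * boxes  # every bundle contains exactly 2 items
    return pro_kits / total_items

print(f"{novice_pro_kit_share(100_000):.1%}")
```

The printed share should land close to the 12.5 % from the box info, even though the chance per bundle is 25 %.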

The same goes for the Expert Bundle: It "contains 4 possible rewards", one of which is a guaranteed blueprint pack.

Info:
 * Blueprint Boxes: 25 %
 * Tokens: 2.5 %
 * Credits: 22.5 %
 * Pro Kit Boxes: 50 %

As 25 % of the items are guaranteed blueprints, the other 75 % would be divided into 2.5 % Tokens, 22.5 % Credits and 50 % Pro Kit Boxes.

Real probabilities would be:
 * Tokens: $$\tfrac{2.5}{75} = \tfrac{3.33}{100}$$ = 3.33 %
 * Credits: $$\tfrac{22.5}{75} = \tfrac{30}{100}$$ = 30 %
 * Pro Kit Boxes: $$\tfrac{50}{75} = \tfrac{66.67}{100}$$ = 66.67 %
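The same rescaling can be done for all random items at once. The sketch below uses exact fractions so that the rounding in 3.33 % and 66.67 % becomes visible; `renormalize` is a hypothetical helper, and it still rests on the premise that only the blueprint pack is guaranteed.

```python
from fractions import Fraction

def renormalize(shares, guaranteed_share):
    """Rescale box-info shares of random items to the non-guaranteed rest."""
    remaining = 100 - guaranteed_share
    if remaining <= 0:
        raise ValueError("no random share left")
    return {name: share / remaining for name, share in shares.items()}

# Expert Bundle, random items only (values in percent from the box info).
expert_random = {
    "Tokens": Fraction("2.5"),
    "Credits": Fraction("22.5"),
    "Pro Kit Boxes": Fraction(50),
}
rates = renormalize(expert_random, guaranteed_share=25)
for name, rate in rates.items():
    print(f"{name}: {rate} = {float(rate):.2%}")
```

The three results add up to exactly 1, which is a handy sanity check that no share was forgotten.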

You may have noticed that I wrote "would be". That's because all my calculations had the premise that there were no hidden rules or underlying extra probabilities for certain items. But there were. When I opened my Expert Bundles, I realized that every one of them contained either Credits or Tokens; not a single box came without one of the two. After 20 boxes it is quite safe to assume that there is a second guaranteed item which grants "currency".
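How unlikely would that streak be by luck alone? A rough sketch, assuming the probabilities calculated before (Tokens 3.33 % plus Credits 30 %, so about one third per random slot), 3 random slots per box and slots that are filled independently. The function name and the independence assumption are mine.

```python
def chance_all_boxes_have_currency(p_currency_per_slot, random_slots, boxes):
    """Chance that every box shows at least one currency item by luck alone.

    Assumption: the random slots are filled independently of each other.
    """
    p_box_without = (1 - p_currency_per_slot) ** random_slots
    return (1 - p_box_without) ** boxes

# Tokens (3.33 %) plus Credits (30 %) per random slot, 3 random slots, 20 boxes.
p = chance_all_boxes_have_currency(1 / 3, random_slots=3, boxes=20)
print(f"{p:.4%}")
```

Under these assumptions the result is well below 1 %, so a hidden guaranteed currency item is the far more plausible explanation.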

This changes the expected values:
 * Blueprint Boxes: 25 % (guaranteed, like before)
 * Currency: 25 % (guaranteed, Tokens and Credits)
 * Pro Kit Boxes: 50 %

The new real probabilities are:
 * Tokens: $$\tfrac{2.5}{25} = \tfrac{10}{100}$$ = 10 %
 * Credits: $$\tfrac{22.5}{25} = \tfrac{90}{100}$$ = 90 %
 * Pro Kit Boxes: $$\tfrac{50}{50} = \tfrac{100}{100}$$ = 100 %

If there aren't any additional specific probabilities assigned to the 13 items counting as a Pro Kit Box, each of them has a probability of $$100 \% \cdot \tfrac{1}{13} = 7.69 \%$$.
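A revised model like this can be checked by running it backwards: if it is right, it has to reproduce the percentages from the box info. The slot layout below is my reading of the observations (1 blueprint slot, 1 currency slot, 2 Pro Kit slots), and `expected_shares` is a name invented for this sketch.

```python
def expected_shares(slots):
    """Turn a per-slot model into box-info style shares (in percent).

    `slots` is a list with one dict per item slot, mapping a category to
    its probability of filling that slot.
    """
    totals = {}
    for slot in slots:
        for category, probability in slot.items():
            totals[category] = totals.get(category, 0.0) + probability
    return {c: 100 * total / len(slots) for c, total in totals.items()}

# Assumed Expert Bundle model: 4 slots per box.
expert_model = [
    {"Blueprint Boxes": 1.0},           # guaranteed blueprint pack
    {"Tokens": 0.10, "Credits": 0.90},  # assumed guaranteed currency slot
    {"Pro Kit Boxes": 1.0},
    {"Pro Kit Boxes": 1.0},
]
print(expected_shares(expert_model))
```

If the printed shares match the 25 / 2.5 / 22.5 / 50 from the info, the model is at least consistent with what the game shows.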

The general way to extract the probabilities from boxes with guaranteed items is:
 * Take away the expected values of all guaranteed items from the total of 100 % and
 * set the expected values of the remaining random items in relation to the remaining percentage.

Or, as a general formula:

Let
 * $$X$$ be a random item in a box,
 * $$G$$ be all guaranteed items in this box,
 * $$P(X)$$ be the probability of $$X$$ in percent and
 * $$\operatorname{E}[X]$$, $$\operatorname{E}[G]$$ be the expected values (percentages) of $$X$$ and $$G$$ as provided in the box info in percent, then


 * $$P(X) = \frac{\operatorname{E}[X]}{100 - \operatorname{E}[G]} \cdot 100$$

Example:
 * $$X$$: Pro Kit Boxes in a Novice Bundle
 * $$\operatorname{E}[X]$$ = 12.5
 * $$\operatorname{E}[G]$$ = 50 (as 1 out of 2 = 50 % is guaranteed)


 * $$P(X) = \frac{12.5}{100 - 50} \cdot 100 = 25$$
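For those who prefer to let a computer do the arithmetic, the formula translates almost literally into a small function. This is a minimal sketch; the name `drop_rate` and the input checks are my additions.

```python
def drop_rate(expected_item, expected_guaranteed):
    """P(X) = E[X] / (100 - E[G]) * 100, all values in percent."""
    remaining = 100 - expected_guaranteed
    if remaining <= 0:
        raise ValueError("guaranteed items already fill the whole box")
    if not 0 <= expected_item <= remaining:
        raise ValueError("E[X] must lie between 0 and 100 - E[G]")
    return expected_item / remaining * 100

# Pro Kit Boxes in a Novice Bundle: E[X] = 12.5, E[G] = 50.
print(drop_rate(12.5, 50))  # 25.0
```

The checks reject inputs that cannot come from a real box info, such as a random item whose share is larger than everything left over after the guaranteed items.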

This was the easy part. The next post will show you how to calculate the probability of getting desired items if you open more than one box.

If you have questions: Feel free to comment!