“Everybody believes in the exponential law of errors [i.e., the Normal distribution]: the experimenters because they think it can be proved by mathematics; and the mathematicians, because they believe it has been established by observation.” – Whittaker and Robinson
What you theorize is not always what you get
Distributions crystallize the core intellectual challenge of statistics and probability: they demand a high level of abstraction yet must rest on factual grounds. When reading this post, always remember that theory may help you in general but can fail you in particular cases. This is especially true for the normal distribution, which is the simplest one and the first step in probability analysis. After understanding it, rest assured, we will challenge it in later posts.
Imagine you are playing poker. You know that with American Airlines (a pair of aces) you have a 30% chance of winning at pre-flop. This is theory: game theory optimal (GTO) play uses the probabilities of your hand, your opponents' hands, and their behavior. If you play only once or twice, however, you won't observe this figure, and the number itself rests on assumptions about other players' rationality and experience. In other words, you have to repeat the experiment many times under given conditions to retrieve something consistent with the theory. That is the practice.
In a perfect environment, if you play according to GTO many times, you are almost certain to win. Even though a perfect environment doesn't exist, keeping GTO in mind and adapting is a good strategy.
Using theory, even when it is not perfectly applicable, is rather revolutionary. When our distant ancestors played dice, they were convinced that some divinity drove the game. Our brains are naturally subject to biases and beliefs; we constantly hear things like "Last time this happened, so I won't do it anymore."
For more about this, I suggest "Against the Gods: The Remarkable Story of Risk" and "Capital Ideas", both by Peter Bernstein.
A simple example
Let's get back to the matter at hand, where the question is: if I buy a stock or any other investment vehicle, how much can I win, how much can I lose, and how often? We will try to answer these questions with a first set of probability tools built around the concept of the normal distribution. Fortunately, since data is readily available, financial market analysis is a good playground for this subject. But let's start with a simple example.
Data is presented in the form of a list of numbers: you have invested in a stock, say A, that increases by 22.2%, then 8.3%, and so on. For 3 stocks it may be represented like this:
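As a concrete illustration, the lists might look like the following sketch. The stock names and every number here are invented for the example (only the first two values of stock A come from the text above):

```python
# Hypothetical return series (in %) for three stocks.
# All values below are made up for illustration.
returns_a = [22.2, 8.3, -4.1, 5.6, 1.2, -2.7, 9.8, 3.4]
returns_b = [25.0, 12.1, -1.5, 9.2, 4.8, 0.3, 13.6, 7.1]    # centered further right than A
returns_c = [40.5, -18.2, 2.9, 21.7, -11.4, 6.0, 30.1, -9.8]  # more dispersed than A
```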
The challenge is to map these observed values, formalized as lists, onto the probability that a given value occurs. We call this a distribution, and we approximate it with a histogram. You already know this type of chart:
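A histogram is just binning plus counting. Here is a minimal sketch of the numbers behind such a chart, using `numpy.histogram` on a hypothetical return series (the values are invented for illustration):

```python
import numpy as np

# Hypothetical returns for one stock (in %); numbers invented for illustration.
returns_a = [22.2, 8.3, -4.1, 5.6, 1.2, -2.7, 9.8, 3.4]

# Bin the observed values, then turn counts into empirical probabilities.
counts, bin_edges = np.histogram(returns_a, bins=5)
probabilities = counts / counts.sum()

for p, lo, hi in zip(probabilities, bin_edges[:-1], bin_edges[1:]):
    print(f"[{lo:6.2f}, {hi:6.2f})  probability {p:.2f}")
```

Plotting these bars gives exactly the chart described above; the probabilities sum to 1 by construction.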
This simple representation gives a good idea of the center and the shape of the distribution. We can deduce that:
Stock B's center is to the right of stock A's. This means that B is more likely to give higher returns.
Stock C is more dispersed than stock A. This means that C is more likely to produce extreme returns.
As powerful as it is, the histogram is computed from a limited sample of values (our lists) and deals with a predetermined set of intervals (the bins) and their probabilities. Moreover, beyond visually comparing two distributions, which may be more complicated than our curated examples, we can't say much.
We need a model: a mathematical way to deal more precisely with things unseen before. We want a curve that gives probability as a function of value, any value, not just a bin.
Why the normal distribution?
A common choice for such a curve is the normal distribution, with its well-known formula and properties. In this context, "common" is a carefully chosen word, since it is an omnipresent distribution used to explain, or at least approximate, many phenomena in the wild, such as heights and weights... and stock returns.
There are two reasons for this: one theoretical and one practical.
One big result of probability theory, the central limit theorem (CLT), states that if you draw a sample of values enough times, their sum, and thus their mean, converges to a normal distribution. The power behind this result is that the values may follow any distribution, provided they are identically distributed and independent.
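The CLT is easy to see by simulation. The sketch below draws samples from a decidedly non-normal distribution (uniform on [0, 1]) and shows that the sample means cluster around the true mean with the spread the theorem predicts; the sample sizes and seed are arbitrary choices for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(42)

# 5000 experiments, each averaging 500 uniform draws.
n, trials = 500, 5000
sample_means = rng.uniform(0, 1, size=(trials, n)).mean(axis=1)

# By the CLT, the means are approximately normal around 0.5
# with standard deviation sqrt(1/12) / sqrt(n).
predicted_std = (1 / 12) ** 0.5 / n ** 0.5
print(sample_means.mean())  # close to 0.5
print(sample_means.std(), predicted_std)  # the two should nearly match
```

A histogram of `sample_means` would show the familiar bell shape, even though each individual draw was uniform.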
The normal distribution framework is famous because it is easy to manipulate (it comes with well-established computations and software). This is convenient in the context of portfolios, when you want to mix multiple stocks.
A victim of its own success, it was originally called the Gauss-Laplace distribution. The term "normal" was introduced by Karl Pearson, the man who established statistics as we know it today. He later expressed some regret: "Many years ago I called the Laplace-Gaussian curve the normal curve, which name, while it avoids an international question of priority, has the disadvantage of leading people to believe that all other distributions of frequency are in one sense or another 'abnormal'. That belief is, of course, not justifiable. It has led many writers to try and force all frequency distributions, by the aid of one or another process of distortion, into a 'normal' curve." (paper read to the Society of Biometricians and Mathematical Statisticians, June 14, 1920, quoted from Gilbert Saporta, Probabilités, analyse des données et statistique, Technip Éditions, 1990).
Keep in mind that we choose the normal distribution because it is easy to use and has solid theoretical foundations. Other distributions exist in the wild, and some fit the behavior of returns better. We will discuss that later.
Return means and volatility
Armed with our normal distribution, we need two bullets. Mathematically, the normal distribution depends on two parameters computed from the data.
The mean is the sum of the returns divided by the number of values.
The standard deviation, also called the volatility in the finance literature, is the square root of the variance. The variance is the sum of the squared differences (squared deviations) between the returns and the mean, divided by the number of values.
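These two definitions translate directly into code. A minimal sketch, using a hypothetical return series (the numbers are invented for illustration):

```python
import math

# Hypothetical returns for one stock (in %); numbers invented for illustration.
returns = [22.2, 8.3, -4.1, 5.6, 1.2, -2.7, 9.8, 3.4]
n = len(returns)

# Mean: sum of returns divided by the number of values.
mean = sum(returns) / n

# Variance: average of the squared deviations from the mean.
variance = sum((r - mean) ** 2 for r in returns) / n

# Volatility: square root of the variance.
volatility = math.sqrt(variance)

print(f"mean = {mean:.2f}%, volatility = {volatility:.2f}%")
```

This uses the population variance (dividing by n, as in the definition above); statistics libraries often default to the sample variance, which divides by n - 1 instead.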
Here is an example of normal distributions with multiple means and standard deviations. You can plot your own here.
You have certainly heard about the mean and the standard deviation, whether out of curiosity or from an undergraduate statistics course. What is interesting is that the mean and the standard deviation are mainly derived as parts of the normal distribution ecosystem. There are many other measures of central tendency and shape out there. However, staying within the normal distribution framework offers a lot of convenient tools, with a model that is a reasonably good fit for the data.
Now we don't need the full list of values: with just the mean and the standard deviation (say your friend is on the other side of the planet and has limited communication time), we can tell whether one stock is more profitable or riskier than another. Thus, we see that B has higher returns than A, and that C is riskier than A.
Said differently, we have well-defined quantities, like KPIs, that let us make actionable decisions. This is not the case with a histogram.
Let's continue this way and quantify risk.
Risk analysis with normal distribution
For example, it is well known that the interval of one standard deviation around the mean (mean +/- 1 standard deviation) covers about 68% of the probability. As a result, any value (absent further details) has a 68% chance of falling in this interval.
In the same manner, a random value has a 95% (resp. 99%) chance of falling in the interval mean +/- 1.96 standard deviations (resp. mean +/- 2.58 standard deviations). This gives you a good idea of how much you can earn, and more importantly how much you can lose, from just two parameters.
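These coverage figures can be checked directly with the standard normal cumulative distribution function, available in Python's standard library as `statistics.NormalDist`:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def central_probability(k: float) -> float:
    """Probability of landing within mean +/- k standard deviations."""
    return z.cdf(k) - z.cdf(-k)

print(round(central_probability(1.0), 3))   # ~0.683
print(round(central_probability(1.96), 3))  # ~0.950
print(round(central_probability(2.58), 3))  # ~0.990
```

Because a normal variable can always be rescaled to the standard one, these probabilities hold for any mean and standard deviation.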
Here is an example of a normal distribution with mean 0 and standard deviation 1 that depicts this relation between probabilities of events (slicing areas under the curve).
In risk analysis, we are interested in how much we can lose and how often. A stock with a 5% mean return and 10% volatility, assuming its returns are normally distributed, can lose 18% or more with 1% probability, 11% or more with 5% probability, and 8% or more with 10% probability.
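These loss levels are just left-tail quantiles of the fitted normal. A minimal sketch, reusing the 5% mean / 10% volatility stock from above:

```python
from statistics import NormalDist

# Stock from the example: 5% mean return, 10% volatility,
# returns assumed normally distributed.
stock = NormalDist(mu=5, sigma=10)

# inv_cdf(p) gives the return level undershot with probability p.
for p in (0.01, 0.05, 0.10):
    print(f"with {p:.0%} probability, return is below {stock.inv_cdf(p):.1f}%")
```

The 1% quantile lands near -18%, the 5% quantile near -11%, and the 10% quantile near -8%, matching the figures quoted above.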