Information Theory in the Financial Markets
Insights and Analysis Using Kenyan Financial Market Data
Introduction
In 1948 Claude Shannon founded the field of information theory in his paper titled "A Mathematical Theory of Communication." His interest was in how much information a communication channel could transmit. In the financial markets, investors and analysts are interested in separating noise from information to aid in predicting market movements and measuring risk exposure.
As defined by Shannon, information theory is the mathematical treatment of the concepts, parameters and rules governing the transmission of messages through communication systems. Simply put, information theory is concerned with transmitting and interpreting information over a noisy channel.
Before diving into the post I'll start off with a couple of definitions to guide the discussion, so that we keep in mind that our reference point for the topic is the financial markets.
Information - the decrease in ambiguity regarding a phenomenon; the increment to our knowledge when we observe a specific event.
Uncertainty - something which is possible but unknown. The relation here is the extent to which we can predict the future.
Shannon entropy - denoted \(H(X)\), a measure of the amount of uncertainty associated with a variable \(X\) when only its distribution is known, i.e. a measure of the amount of information we hope to learn from an outcome (observation).
There are also certain assumptions that information theory works within:
- Likely events should have low information content, and in the extreme case, events that are guaranteed to happen should have no information content whatsoever.
- Less likely events should have higher information content.
- Independent events should have additive information. For example, finding out that a tossed coin has come up heads twice should convey twice as much information as finding out that it has come up heads once (the sketch below checks this numerically).
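As a quick numeric check of these assumptions, here is a minimal sketch using the self-information \(I(x) = -\log_2 P(x)\), the per-event quantity that Shannon entropy averages (the selfinfo helper is just for illustration):

# Self-information in bits: I(x) = -log2(p(x)).
selfinfo(p) = -log2(p)

selfinfo(1.0)        # 0 bits: a guaranteed event carries no information
selfinfo(0.5)        # 1 bit: one fair-coin head
selfinfo(0.5 * 0.5)  # 2 bits: two independent heads, i.e. additive
selfinfo(0.01)       # ~6.64 bits: a rare event carries much more information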
Why do we need to measure the randomness of a series in the financial markets?
The securities markets reflect the interaction of many stakeholders buying or selling a particular security. Influenced by their beliefs and the economic situation, stakeholders may feel inclined to buy or sell in markets with clear trends, or to act more erratically in times of uncertainty. Therefore, by quantifying the level of randomness of the securities markets we can obtain insights into the general behavior of the participants.
Generally, it is common practice to treat the variance (or standard deviation) and VaR (Value-at-Risk) as the main risk and uncertainty measures in financial markets. However, these measures can fail as measures of uncertainty in specific situations, since they require the probability distributions to be symmetric and neglect the possibility of extreme events such as fat tails, which have been shown to occur in securities returns.
Let's do an illustration to see these shortcomings.
Assume we have two series A and B:
A = [0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1]
B = [0,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,0,1,0]
The two series have been produced randomly with the same probability and have the same mean and variance. Clearly, it is easy to continue the pattern described by series A, but we would have to work harder to predict the next number in series B.
I'll use Julia for the illustrations.
using Statistics

A = [0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1]
B = [0,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,0,1,0]

# Run the classical risk metrics on both series.
std(A)
std(B)
var(A)
var(B)
Series A is an alternation of zeros and ones, while series B has no clear pattern. If we run the standard deviation and variance, which are well-known measures of risk, we get 0.51298917 and 0.263157 respectively, identical for both series A and series B. You can go on and calculate the kurtosis, and the outcome would again be equal for both series despite the clear difference in randomness we see. The analysis of uncertainty using moment statistics is therefore limited and cannot mathematically characterize the uncertainty of a series.
The probabilistic approach of analyzing moments of different orders does not analyze the randomness of a series but the randomness of the process that generated it. What this means is that if we shuffled either of the previous two series, we would obtain the same values using moment statistics. The calculations are independent of the ordering, since their terms are of the form \(\sum_i (x_i - \bar{x})^a\), where \(a\) is the order of the calculated moment.
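A quick check of this claim, using the Random standard library (the fixed seed is arbitrary):

using Random, Statistics

# Shuffling B reorders its terms but leaves every moment unchanged,
# because sums of the form sum((x_i - mean)^a) do not depend on order.
shuffledB = shuffle(MersenneTwister(42), B)

mean(B) ≈ mean(shuffledB)   # true
var(B) ≈ var(shuffledB)     # true
std(B) ≈ std(shuffledB)     # true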
Since entropy quantifies the amount of information, it also measures the degree of randomness in a data series. One of the characteristics of measuring randomness this way is that it quantifies the level of randomness, so the question is no longer whether a data series is random, but how random it is. We must note that entropy is a function not of the values of the variables but of their probabilities, and the property \(H(X, Y) \leq H(X) + H(Y)\) can bring some hope in this direction.
Mathematical formulation
Information theory was designed to analyze the process of sending messages through a noisy channel and to understand how to reconstruct the message with a low probability of error.
The analysis of communications by telegraph led to a formula for measuring information of the form \(H = n \log S\), where \(S\) is the number of possible symbols and \(n\) the number of symbols transmitted. This formula implicitly assumes all symbols are equally likely; Shannon extended the formulation to bypass that restriction, proposing entropy as the measure of information of a single random variable.
If \(X\) can take the values \(\{x_1, \dots, x_n\}\) and \(P(x)\) is the probability associated with each value \(x \in X\), entropy is defined as:
\[H(X) = - \sum\limits_{x \in X} P(x) \log P(x)\]
The formula clearly depicts our earlier point: entropy is based on the probability of occurrence of each symbol, so it is a function not of the values of the series themselves but of their probabilities.
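As a sanity check that this generalizes the telegraph formula: when all \(S\) symbols are equally likely, \(P(x) = 1/S\) for every symbol, and the entropy reduces to
\[H(X) = -\sum_{i=1}^{S} \frac{1}{S} \log \frac{1}{S} = \log S\]
so a message of \(n\) independent symbols carries \(n \log S\), recovering \(H = n \log S\) as the equiprobable special case.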
Moving on, we apply this to actual market data. We'll use the Kenyan financial market as an example.
The libraries used for this task:
using Pipe
using DataFrames
using Dates
using RollingFunctions
using Plots
using Impute
We'll start off by calculating daily returns, since the data we have are prices.
# Convert a price series into simple daily returns, padding the first
# day with 0 so the output has the same length as the input.
function returnFunction(inputs::AbstractArray{Float64,1})
    rets = (inputs[2:end]) ./ inputs[1:end-1] .- 1
    pushfirst!(rets, 0.0)
    return rets
end
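As a quick check on a hypothetical three-day price path:

returnFunction([100.0, 102.0, 101.0])
# 3-element Vector{Float64}:
#   0.0        (first day is padded with zero)
#   0.02       (100 -> 102 is +2%)
#  -0.0098...  (102 -> 101 is about -0.98%)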
The next step is to formulate the model: any day in the examination period with a return above zero counts as a price-up indicator, and any day with a return below zero counts as a price-down indicator; from these counts we obtain the respective probabilities.
Note that entropy can be formulated in a deterministic or a stochastic form; we take the former for this article. Secondly, the above formulation could be approached in many different ways, and this is just one among them.
Now that we have the probabilities, we can calculate the entropy for both the price-up and price-down scenarios and add them up to get the total entropy.
function ShanonEntropy(inputData::AbstractArray{Float64,1})
    X = inputData
    len = length(X)
    # Assign probabilities: a return of zero or above is market up,
    # below zero is market down.
    p_up = sum(X .>= 0) / len
    p_down = sum(X .< 0) / len
    total = 0
    if p_up > 0
        total = total + p_up * log(2, p_up)
    end
    if p_down > 0
        total = total + p_down * log(2, p_down)
    end
    return -total
end
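A quick sanity check on toy return series (note that, as written, the function counts a zero return as an up day because of the .>= comparison):

ShanonEntropy([0.01, -0.02, 0.03, -0.01])  # 1.0: even up/down split, maximum uncertainty
ShanonEntropy([0.01, 0.02, 0.03, -0.01])   # ~0.811: three up days vs one down, less uncertainty
ShanonEntropy([0.01, 0.02, 0.03, 0.04])    # 0 bits: all up days, no uncertainty at all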
At this point we have the various functions needed for the task, so let's go ahead and clean and organize the data into the expected format.
These are the securities we are going to use:
- Kenya Commercial Bank Ltd Ord 1.00
- Safaricom Ltd Ord 0.05
- East African Breweries Ltd Ord 2.00
assets = ["Safaricom Ltd Ord 0.05",
          "Kenya Commercial Bank Ltd Ord 1.00",
          "East African Breweries Ltd Ord 2.00"]

SecuritiesReturn = begin
    @pipe NSEMarketData |>
        select(_, Not([:Low, :High])) |>
        filter(x -> x.MarketDate >= Date("2015-01-01", "y-m-d") && x.MarketDate <= Date("2020-12-31", "y-m-d"), _) |>
        filter(x -> x.Name in assets, _) |>
        unstack(_, :Name, :MarketPrice) |>
        rename(_, :MarketDate => :Date) |>
        Impute.locf |>
        transform(_, [:2, :3, :4] .=> ByRow(Float64), renamecols = false) |>
        transform(_, [:2, :3, :4] .=> returnFunction, renamecols = false) |>
        dropmissing(_)
end
Looking at the top rows of our data frame, the data is ready for crunching.
5×4 DataFrame
Row │ Date Kenya Commercial Bank Ltd Ord 1.00 Safaricom Ltd Ord 0.05 East African Breweries Ltd Ord 2.00
│ Date Float64 Float64 Float64
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2015-01-02 0.0 0.0 0.0
2 │ 2015-01-05 -0.00877193 -0.00706714 0.00331126
3 │ 2015-01-06 -0.00884956 -0.0106762 0.0132013
4 │ 2015-01-07 0.0 0.0 -0.00651466
5 │ 2015-01-08 0.0 -0.0107914 0.00983607
The next step is to calculate the 20-day moving Shannon entropy to see which of our selected securities is the most random, i.e. has the highest level of uncertainty. We begin with the whole-period entropy before proceeding to the 20-day moving entropy.
ShanonEntropy(IndexReturns.NseAllshare)
ShanonEntropy(SecuritiesReturn[:,Symbol("Safaricom Ltd Ord 0.05")])
ShanonEntropy(SecuritiesReturn[:,Symbol("Kenya Commercial Bank Ltd Ord 1.00")])
ShanonEntropy(SecuritiesReturn[:,Symbol("East African Breweries Ltd Ord 2.00")])
For the three securities, the outputs are 0.9544, 0.9733 and 0.9825 respectively. The interpretation is quite direct: for a binary outcome, entropy ranges from 0 to 1 bit, and the higher the value, the higher the uncertainty. From this we can comfortably say that over the selected period the securities showed high levels of randomness, as they were only a few hundredths shy of 1.0.
# Calculate the 20-day rolling Shannon entropy for each security.
EntropySaf = rolling(ShanonEntropy, SecuritiesReturn[:, Symbol("Safaricom Ltd Ord 0.05")], 20)
EntropyKCB = rolling(ShanonEntropy, SecuritiesReturn[:, Symbol("Kenya Commercial Bank Ltd Ord 1.00")], 20)
EntropyEABL = rolling(ShanonEntropy, SecuritiesReturn[:, Symbol("East African Breweries Ltd Ord 2.00")], 20)

# rolling returns length - 19 values, each covering a 20-day window ending
# on that date, so drop the first 19 dates to keep the vectors aligned.
MarketDate = SecuritiesReturn[20:end, :Date]
DfSecuritiesEntropy = DataFrame(Date = MarketDate,
                                KCBEntropy = EntropyKCB,
                                EABLEntropy = EntropyEABL,
                                SafEntropy = EntropySaf)

# Plot each security's rolling entropy against time.
plot2 = plot(DfSecuritiesEntropy[:, :Date], DfSecuritiesEntropy[:, :KCBEntropy],
             title = "KCB Share Rolling 20 Day Shannon Entropy vs Time",
             xlabel = "Year",
             ylabel = "Rolling 20 Day Shannon Entropy")

plot3 = plot(DfSecuritiesEntropy[:, :Date], DfSecuritiesEntropy[:, :EABLEntropy],
             title = "EABL Share Rolling 20 Day Shannon Entropy vs Time",
             xlabel = "Year",
             ylabel = "Rolling 20 Day Shannon Entropy")

plot4 = plot(DfSecuritiesEntropy[:, :Date], DfSecuritiesEntropy[:, :SafEntropy],
             title = "Safaricom Share Rolling 20 Day Shannon Entropy vs Time",
             xlabel = "Year",
             ylabel = "Rolling 20 Day Shannon Entropy")

plot(plot2, plot3, plot4, layout = (3, 1), legend = false, size = (1000, 1000))
Coming back to the 20-day moving Shannon entropy, we can observe times when the securities were volatile and times when they weren't. (You can play around with this, change the number of days, and see how the behavior changes.) This can help in deciding when best to use a predictive model, and also in risk evaluation. Whenever a security shows a period of lower entropy, say a value approaching 0.5 as Safaricom's did somewhere between 2016 and 2017, the period wasn't filled with as much randomness, and you can place some confidence in your predictive model. Conversely, whenever there is an extended period of high randomness, you can gauge the risk involved. The most valuable thing is having this information ahead of time, which gives the analyst or investor an advantage; methods like Monte Carlo simulation and integration could be used to look n days ahead and gauge the nature of the series in advance.
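As a rough illustration of that last point, here is a minimal Monte Carlo sketch. It assumes, purely for illustration, that future returns can be bootstrapped i.i.d. from the observed history, and the mcEntropyAhead name is hypothetical:

using Random, Statistics

# Estimate the expected entropy of the next n days by resampling n
# returns (with replacement) from the observed series many times and
# averaging ShanonEntropy over the simulated paths.
function mcEntropyAhead(returns::AbstractVector{Float64}, n::Int; trials::Int = 10_000)
    rng = MersenneTwister(1234)   # fixed seed for reproducibility
    mean(ShanonEntropy(rand(rng, returns, n)) for _ in 1:trials)
end

# e.g. a 20-day-ahead estimate for Safaricom:
mcEntropyAhead(SecuritiesReturn[:, Symbol("Safaricom Ltd Ord 0.05")], 20)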
There are more advanced entropy measures that can be used to increase precision, such as Approximate Entropy (ApEn).
Conclusion
Note: with Shannon's definition, events with very high or very low probability contribute little to the value of the measure, as the binary entropy function below shows at p = 0 and p = 1.
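For a binary outcome with probability \(p\), the measure reduces to the binary entropy function
\[H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)\]
which equals 0 at \(p = 0\) and \(p = 1\) and peaks at 1 bit when \(p = 0.5\); both near-certain and near-impossible events therefore add little to the measure.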
In trying to show how information theory can aid in making investment decisions, we can say that evaluating the randomness of a data series is really a matter of identifying the system as either stochastic or deterministic; specifically, to what extent the data behaves stochastically and how much determinism exists.
If there is no underlying deterministic process in a particular series of data, this implies strictly random movement. However, if there are cycles, trends, and patterns, then the data series is not totally random, making it necessary to quantify its degree of randomness.