Physical Meaning of the Marchenko-Pastur Theorem

This post describes the physical meaning of the Marchenko-Pastur Theorem (MPT).

Physical meaning:
The theorem predicts the probability density function (PDF) of the eigenvalues of a sample covariance matrix when the underlying observations are IID (Independent and Identically Distributed) with zero mean and variance sigma^2. Given T observations of N variables, with ratio q = T/N, the eigenvalues concentrate (as T and N grow large) in the interval [sigma^2 * (1 - sqrt(1/q))^2, sigma^2 * (1 + sqrt(1/q))^2].

Practical use:
The IID assumption is crucial because it lets us use the MPT as a template for the expected eigenvalue PDF when a process contains only random noise and no underlying signal. (The PDF is for the eigenvalues of the covariance matrix of the underlying process, adjusted to zero mean and variance sigma^2.) Eigenvalues that fall inside the theoretical support are consistent with pure noise; eigenvalues above the upper edge indicate signal. We can therefore estimate what fraction of our observations is signal, localise it to a certain eigenvalue and above, and then try to extract it.
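As a sketch of this signal-detection idea (my own illustration, not code from the original post; the seed, the factor loading of 2.0, and the 5% tolerance are arbitrary choices): generate pure IID noise, check that the largest eigenvalue of its sample covariance matrix stays below the MP upper edge lambda_+ = sigma^2 * (1 + sqrt(1/q))^2, then inject a common factor and watch at least one eigenvalue escape the noise band.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, var = 10000, 100, 1.0
q = T / N

# Pure IID noise: all eigenvalues should fall (approximately) below lambda_+.
X = rng.normal(0, var ** 0.5, size=(T, N))
cov = X.T @ X / T                         # sample covariance (data already zero-mean)
eVals = np.linalg.eigvalsh(cov)
lambda_plus = var * (1 + (1 / q) ** 0.5) ** 2

print(eVals.max() <= lambda_plus * 1.05)  # noise stays inside the MP support

# Inject one strong common factor ("signal"): every column loads on it.
factor = rng.normal(0, 1, size=(T, 1))
X_sig = X + 2.0 * factor                  # hypothetical loading, for illustration
cov_sig = X_sig.T @ X_sig / T
eVals_sig = np.linalg.eigvalsh(cov_sig)

# Count eigenvalues flagged as signal (above the MP upper edge).
print((eVals_sig > lambda_plus).sum())
```

The flagged eigenvalue is the one carrying the injected factor; everything below lambda_+ is indistinguishable from noise under the MPT template.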

MPT works best when you have at least 10x as many observations (T) as independent variables (N), i.e. q = T/N >= 10. The larger q is, the closer the empirical eigenvalue distribution gets to the theoretical one.
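A quick empirical check of this (a sketch I added; the 5% tolerance on the support edges and the seed are arbitrary): for each ratio q, the eigenvalues of an IID sample covariance matrix should fall inside the theoretical MP support [lambda_-, lambda_+].

```python
import numpy as np

rng = np.random.default_rng(1)
N, var = 100, 1.0
for q in (2, 10, 100):
    T = int(q * N)
    X = rng.normal(0, var ** 0.5, size=(T, N))
    eVals = np.linalg.eigvalsh(X.T @ X / T)
    # Theoretical MP support edges for this q.
    lo = var * (1 - (1 / q) ** 0.5) ** 2
    hi = var * (1 + (1 / q) ** 0.5) ** 2
    inside = np.mean((eVals >= lo * 0.95) & (eVals <= hi * 1.05))
    print(f"q={q:>3}: fraction of eigenvalues inside MP support = {inside:.2f}")
```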

Results:
When you run the code below, you can see that:
1. The eigenvalues are always positive
2. The PDF is centred near the value of the variance and is almost symmetrically distributed
3. At lower variances the distribution is narrower and peakier; at higher variances it is wider and flatter (the density scales inversely with the variance)



Compute using Python:
import numpy as np, pandas as pd
import matplotlib.pyplot as plt
# ---------------

def mpPDF(var, q, pts):
    # q = T/N (observations per variable)
    # Marchenko-Pastur support edges: sigma^2 * (1 +/- sqrt(1/q))^2
    eMin = var * (1 - (1. / q) ** .5) ** 2
    eMax = var * (1 + (1. / q) ** .5) ** 2
    eVal = np.linspace(eMin, eMax, pts)
    # Marchenko-Pastur density evaluated on the support
    pdf = (q / (2 * np.pi * var * eVal)) * (((eMax - eVal) * (eVal - eMin)) ** .5)
    pdf = pd.Series(pdf, index=eVal)
    return pdf

pdf0 = mpPDF(var=1.,q=100000/1000,pts=1000)
pdf1 = mpPDF(var=2.,q=100000/1000,pts=1000)
pdf2 = mpPDF(var=0.5,q=100000/1000,pts=1000)

fig, ax = plt.subplots()
plt.plot(pdf0,'r',label='Variance 1')
plt.plot(pdf1,'g',label='Variance 2')
plt.plot(pdf2,'b',label='Variance 0.5')
ax.set_xlabel('Eigenvalue')
ax.set_ylabel('Probability density')
ax.legend()
plt.show()
