# Central Limit Theorem explained in Python (with examples) – Part IV

###### Posted April 5, 2021 at 3:40 pm
Ashutosh Dave
QuantInsti

See Part I, Part II and Part III to get started.

In case 2, we will repeat the above process, but with a much larger sample size (n=500):

# drawing 50 random samples of size 500
sample_size = 500

df500 = pd.DataFrame()

for i in range(1, 51):
    # rate is the exponential rate parameter, assumed to be defined earlier in the series
    exponential_sample = np.random.exponential((1/rate), sample_size)
    col = f'sample {i}'
    df500[col] = exponential_sample

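# one mean per sample column, i.e. 50 sample means
df500_sample_means = pd.DataFrame(df500.mean(), columns=['Sample means'])
# sns.distplot is deprecated in recent seaborn releases; histplot with kde=True replaces it
sns.histplot(df500_sample_means['Sample means'], kde=True);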

The sampling distribution looks much more like a normal distribution now as we have sampled with a much larger sample size (n=500).
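The experiment above can also be reproduced as a self-contained script. This is only a minimal sketch under stated assumptions: `rate = 0.25` is inferred from the stated population mean of 4 (the mean of an exponential distribution is 1/rate), the seed is arbitrary, and NumPy's `default_rng` generator is used in place of the DataFrame loop, so the exact numbers will differ from those in the article.

```python
import numpy as np

# Assumption: exponential population with mean 4, i.e. rate = 0.25
# (inferred from the article's stated mu = 4).
rate = 0.25
mu = 1 / rate          # population mean of an exponential distribution
sigma = 1 / rate       # population standard deviation (equals the mean)

n_samples = 50         # number of samples, as in the article
sample_size = 500      # observations per sample, as in the article

rng = np.random.default_rng(seed=42)  # arbitrary seed, for repeatability only

# Each row is one sample of size 500; the row means are the 50 sample means.
samples = rng.exponential(scale=1 / rate, size=(n_samples, sample_size))
sample_means = samples.mean(axis=1)

mean_of_means = sample_means.mean()
sd_of_means = sample_means.std()          # ddof=0, the same default as np.std
clt_sd = sigma / np.sqrt(sample_size)     # standard deviation predicted by the CLT

print(f"mean of sample means: {mean_of_means:.4f} (population mean {mu})")
print(f"sd of sample means:   {sd_of_means:.4f} (CLT value {clt_sd:.4f})")
```

Drawing all samples as one 50 × 500 array avoids adding DataFrame columns one at a time, but the statistics being computed are the same as in the loop-based version.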

Let us now check the mean and the standard deviation of the 50 sample means:

# The first 5 values from the 50 sample means
df500_sample_means.head()

We can observe that the mean of all the sample means is quite close to the population mean (μ = 4).

Similarly, we can observe that the standard deviation of the 50 sample means is quite close to the value stated by the CLT, i.e., σ/√n ≈ 0.179.
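The CLT value can be worked out by hand as a quick check. This assumes σ = 4, which follows from the population being exponential with mean 4 (for an exponential distribution the standard deviation equals the mean), and n = 500:

```latex
\frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{500}} \approx \frac{4}{22.3607} \approx 0.1789
```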

# An estimate of the standard deviation of the sampling distribution can be obtained as:
np.std(df500_sample_means['Sample means'].to_numpy())

0.18886796530269118

# The above value is very close to the value stated by the CLT, which is:
sd / np.sqrt(sample_size)

0.17888543819998318

Thus, we observe that the mean of all sample means is very close in value to the population mean itself. We can also observe that, as we increased the sample size from 2 to 500, the distribution of sample means increasingly resembles a normal distribution, with mean given by the population mean μ and standard deviation given by σ/√n, as stated by the Central Limit Theorem.
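The effect of sample size described above can be checked numerically. The sketch below is not from the original series. It assumes `rate = 0.25` (inferred from μ = 4), uses 5,000 replications per sample size rather than the article's 50 (a hypothetical choice to stabilise the estimates), adds an intermediate sample size of 30, and uses a small hand-written `skewness` helper as a rough indicator of how far the sampling distribution is from the symmetric shape of a normal distribution.

```python
import numpy as np

rate = 0.25                # assumption: inferred from the stated mu = 4
sigma = 1 / rate           # sd of an exponential distribution equals its mean
n_replications = 5_000     # hypothetical choice; the article uses 50 samples

rng = np.random.default_rng(seed=0)  # arbitrary seed, for repeatability only


def skewness(x):
    """Third standardized moment; 0 for a perfectly symmetric distribution."""
    centered = x - x.mean()
    return (centered ** 3).mean() / x.std() ** 3


results = {}
for n in (2, 30, 500):
    # n_replications samples of size n, reduced to one mean per sample
    means = rng.exponential(scale=1 / rate, size=(n_replications, n)).mean(axis=1)
    results[n] = {
        "sd": means.std(),                 # empirical sd of the sample means
        "clt_sd": sigma / np.sqrt(n),      # sd predicted by the CLT
        "skew": skewness(means),           # asymmetry of the sampling distribution
    }
    print(f"n={n:>3}: sd={results[n]['sd']:.4f}, "
          f"CLT sd={results[n]['clt_sd']:.4f}, skew={results[n]['skew']:.3f}")
```

For the mean of n independent exponential draws the theoretical skewness is 2/√n, so the printed skewness should fall towards 0 as n grows, while the empirical standard deviation should track σ/√n at each sample size.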

Stay tuned for the next installment, in which Ashutosh will go over the code for a binomially distributed population.