Monte Carlo simulation to test for the correlation between two datasets in MATLAB (code included)

Utpal Kumar

We use Monte Carlo simulations to test the significance of the correlation coefficient (the normalized covariance) between two sets of random numbers drawn from a normal distribution.

Monte Carlo simulations (MCS) can be used to extract important information from a dataset that would be difficult or impossible to assess otherwise. Using MCS rather than the traditional methods to find the relation between two datasets is often more intuitive.
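
For comparison, the traditional approach mentioned above is usually the textbook t-test for a correlation coefficient. The block below is a reminder of that standard result, not something taken from this post: the symbols x_i, y_i, n, r and t are introduced here for illustration, and the t distribution only holds under the assumption of independent, normally distributed data.

```latex
% Pearson correlation coefficient of n paired observations (x_i, y_i)
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}

% Under the null hypothesis of zero correlation (normal data),
% this statistic follows a Student t distribution with n - 2 degrees of freedom
t = r \sqrt{\frac{n - 2}{1 - r^2}}
```

The simulation approach replaces this formula with an empirical distribution of r built from randomly generated data.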

What are Monte Carlo simulations?

MCS studies are computer-driven experimental investigations in which certain parameters, such as population means and standard deviations that are known a priori, are used to generate random (but plausible) sample data (Mooney 1997). These generated data are then used to evaluate the sampling behavior of one or more statistics of interest. This process of generating and analyzing data is repeated over many iterations and differing conditions that are thought to influence the sampling behavior of the statistic of interest (e.g., through increasing sample size, mean differences, variability) [3].
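
The generate, analyze, repeat cycle described above can be sketched in a few lines of MATLAB. This is only an illustration under assumed values: the population mean and standard deviation, the number of simulations, and the three sample sizes are hypothetical choices, and the sample mean stands in for the statistic of interest. Results vary slightly between runs because no random seed is set.

```matlab
% Minimal Monte Carlo sketch: sampling behavior of the sample mean.
% Assumed (hypothetical) population parameters, known a priori:
popMean = 20;                 % population mean
popStd  = 10;                 % population standard deviation
numsim  = 5000;               % number of simulated samples per condition
sampleSizes = [10 50 200];    % conditions to compare

spread = zeros(size(sampleSizes));
for k = 1:numel(sampleSizes)
    n = sampleSizes(k);
    % each column is one simulated sample of size n
    samples = popMean + popStd * randn(n, numsim);
    % statistic of interest: the mean of each simulated sample
    sampleMeans = mean(samples, 1);
    % spread of the statistic across all simulations
    spread(k) = std(sampleMeans);
    fprintf('n = %3d: SD of sample mean = %.3f (theory: %.3f)\n', ...
        n, spread(k), popStd / sqrt(n));
end
```

The simulated spread should shrink as the sample size grows and stay close to popStd / sqrt(n), which is the kind of sampling behavior an MCS study is designed to reveal.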

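%% Monte Carlo simulations of correlation values
% note: corr and prctile require the Statistics and Machine Learning Toolbox
clear; close all; clc;
% define the simulation parameters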
numsim = 10000;   % number of simulations to run
samplesize = 50;  % number of data points in each sample

% pre-allocate the results vector
results = zeros(1,numsim);

% loop over simulations
for num=1:numsim

  % draw two independent sets of random numbers, each from a normal
  % distribution with mean 20 and standard deviation 10
  data = randn(samplesize,2)*10+20;

  % compute the correlation between the two sets of numbers and store the result
  results(num) = corr(data(:,1),data(:,2));

end
% visualize the results
figure; hold on;
histogram(results,100);  % histogram replaces the older hist function
% ax = axis;
% mx = max(abs(ax(1:2)));  % make the x-axis symmetric around 0
% axis([-mx mx ax(3:4)]);
xlabel('Correlation value');
ylabel('Frequency');
% find the value below which 95% of the absolute correlation values fall
val = prctile(abs(results),95);
disp(val);

% visualize this on the figure
ax = axis;
h1 = plot([val val],ax(3:4),'r-');
h2 = plot(-[val val],ax(3:4),'r-');
legend(h1,'Central 95%');
title(sprintf('95%% of the correlation values lie within +/- %.4f',val));
saveas(gcf,'monteCarloSim','pdf');
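
One way to use the simulated distribution is to judge an observed correlation against it. The following is a minimal sketch, not part of the original script: r_obs is a hypothetical observed value, the variable names are chosen for illustration, and only base MATLAB functions are used (corrcoef and sort instead of corr and prctile). The analytic threshold assumes normally distributed data and uses 2.0106, the tabulated two-sided 5% critical t value for 48 degrees of freedom.

```matlab
% Sketch: Monte Carlo p-value for an observed correlation (base MATLAB only).
numsim = 10000;      % number of simulated null datasets
samplesize = 50;     % must match the size of the observed sample
r_obs = 0.35;        % hypothetical observed correlation (assumption)

% build the null distribution: correlations of independent normal samples
nullr = zeros(1, numsim);
for num = 1:numsim
    data = randn(samplesize, 2);
    c = corrcoef(data(:,1), data(:,2));   % 2x2 correlation matrix
    nullr(num) = c(1,2);                  % off-diagonal entry is r
end

% two-sided p-value: fraction of null correlations at least as extreme
pval = mean(abs(nullr) >= abs(r_obs));

% simulated 95% threshold on |r|, computed without prctile
sorted = sort(abs(nullr));
thresh = sorted(ceil(0.95 * numsim));

% analytic threshold from the t distribution with samplesize - 2 = 48
% degrees of freedom (tcrit is a tabulated value, valid only for n = 50)
tcrit = 2.0106;
r_crit = tcrit / sqrt(samplesize - 2 + tcrit^2);

fprintf('p = %.4f, simulated threshold = %.4f, analytic = %.4f\n', ...
    pval, thresh, r_crit);
```

If the data really are normal, the simulated and analytic thresholds should agree closely. The advantage of the simulation is that the same loop still works when the data are drawn from a distribution for which no convenient formula exists.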

References:

  1. Lectures on Statistics and Data Analysis in MATLAB
  2. Assessing the Significance of the Correlation between Two Spatial Processes
  3. Teaching Statistics With Monte Carlo Simulation


