MATLAB Answers

How to make weighted random sampling for matrix

Michele Pio Papasidero on 24 Apr 2021
Answered: William Rose on 2 May 2021
Hi, I would like to create a weighted sample from an m-by-n matrix, starting from an Excel file ("DataTab", please see the attached image). The first column (UGT) holds the ID of each row of the matrix, and columns B-F hold the probabilities associated with the variable "fi" for each UGT.
"fi" is a scalar that takes the value 10, 20, 30, 40, or 50. This means that UGT 1101 has a 50% probability of having the value 10 or 20, UGT 1102 can be 30 or 40, and so on.
I used the "randsample" function, which works only for the scalar case, and I have not been able to use it for my situation.
fi = [10,20,30,40,50];
p = [0.15,0.20,0.30,0.10,0.25];
n=10000;
sample = randsample (fi, n, true, p);
hist(sample,fi);
bar(fi,n*p,0.7);
Here is the entire code, which works correctly with the "Fv2" variable but not with "Fv1", which is the goal of my question.
Nsample = 10;
DataTab = xlsread('Scenari_stabilita_R6.xlsx','S1','A2:f6');
Ugt = csvread('raster_ugt.acs');
UgtV = reshape(Ugt,[],1);
MRas = [UgtV];
MCycle = MRas(~idxNaN,:);
a=unique(MCycle(:,4));
idxDataTab=ismember(DataTab(:,1),a);
DataTab2 = DataTab(idxDataTab,:);
nUGT = length(a);
fi = [10,20,30,40,50];
Fv = [];
for i = 1:nUGT
Fv1 = randsample (fi, Nsample, true, DataTab2(i,:));
%Fv2 = (DataTab2(i,3)-DataTab2(i,2)).*rand(Nsample,1) +
%DataTab2(i,2); % this line calculates uniform distribution and
%it has to be modified into weighted sampling
Fv = [Fv,Fv1];
end
The "Fv1" variable must look like this (without the first row, which I show only as an example to make it easier to understand):
Can anyone help me, please?
  1 Comment
William Rose on 2 May 2021
@Michele Pio Papasidero, please upload the files Scenari_stbilita_r6.xlsx and Scenari_stabilita_R6.xlsx and raster_ugt.acs so that we can run your code. Thanks.

Answers (1)

William Rose on 2 May 2021
Function x=weightedRandomSample(v,p), attached, does what you have requested. It returns a random sample from vector v, where the probability of the element selected is given by a vector of probabilities, p.
The function first checks that p and v have the same length and that p sums to unity (within a small tolerance). Then the function generates a uniform random number between 0 and 1. Then the function uses that random number, and the probability vector p, to select an index in the vector. Finally, the function returns the element of v specified by that index.
Vector p is the probability mass function: the probability of selecting each element. The key step in function weightedRandomSample is to make vector pc, the cumulative distribution, which is the cumulative sum of p. The function then uses a uniform random number and vector pc to select the index for sampling.
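The attached file is not reproduced in this thread, so the following is only a minimal sketch of a function that follows the steps described above. The tolerance value, the error identifiers, and the round-off guard on the last element of pc are assumptions and may differ from the attached weightedRandomSample.m.

```matlab
function x = weightedRandomSample(v, p)
% Sketch reconstructed from the description in this answer.
% v : vector of values to sample from
% p : vector of selection probabilities, same length as v
tol = 1e-6;                          % assumed tolerance, not from the attached file
if numel(v) ~= numel(p)
    error('weightedRandomSample:sizeMismatch', ...
          'v and p must have the same length.');
end
if abs(sum(p) - 1) > tol
    error('weightedRandomSample:badProbabilities', ...
          'p must sum to 1.');
end
pc = cumsum(p);                      % cumulative distribution
pc(end) = 1;                         % assumed guard against round-off in cumsum
r = rand();                          % uniform random number in (0,1)
i = 1;
while r > pc(i)                      % advance until r no longer exceeds pc(i)
    i = i + 1;
end
x = v(i);                            % element selected by the index
end
```

For example, weightedRandomSample([10 20 30 40 50], [0 0 1 0 0]) can only return 30, because the loop skips every leading element whose cumulative probability is 0.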
Example 1: p=[.7, .2, .1, 0, 0]. Then pc=[.7, .9, 1, 1, 1]. Suppose r=rand()=0.72. Let i=1. Then check if r>pc(1). Yes it is, so i=i+1=2. Check if r>pc(2). It is not, so exit the while loop, with i=2. Return element 2 from the vector v.
Example 2: p=[.2, .2, .2, .2, .2]. Then pc=[.2, .4, .6, .8, 1]. Suppose r=rand()=0.53. Let i=1. Then check if r>pc(1). Yes it is, so i=i+1=2. Check if r>pc(2). Yes it is, so i=i+1=3. Check if r>pc(3). It is not, so exit the while loop, with i=3. Return element 3 from the vector v.
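The two worked examples can be traced in a few lines of script. This is a sketch only: the value of r is fixed by hand instead of being drawn with rand(), and v is assumed to be the fi vector from the question.

```matlab
v = [10 20 30 40 50];                % assumed: the fi values from the question

% Example 1
p1  = [.7 .2 .1 0 0];
pc1 = cumsum(p1);                    % [.7 .9 1 1 1]
r1  = 0.72;                          % fixed by hand for the trace
i1  = 1;
while r1 > pc1(i1)
    i1 = i1 + 1;
end
x1 = v(i1);                          % i1 is 2, so x1 is 20

% Example 2
p2  = [.2 .2 .2 .2 .2];
pc2 = cumsum(p2);                    % [.2 .4 .6 .8 1]
r2  = 0.53;                          % fixed by hand for the trace
i2  = 1;
while r2 > pc2(i2)
    i2 = i2 + 1;
end
x2 = v(i2);                          % i2 is 3, so x2 is 30
```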
I have also included script weightedRandomSampleDemo.m, which demonstrates the use of function weightedRandomSample.m. The script weightedRandomSampleDemo has six different probability vectors (six rows in probArray). For each probability vector, it takes 1000 random samples of vector fi=[10 20 30 40 50]. Finally, it plots six histograms, showing the number of samples of each value of fi, for the six probability vectors. The plot which it creates, showing six histograms, is below.
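The same cumulative-sum idea can be applied row by row to a table laid out like the one in the question. The sketch below is not the attached demo script: the DataTab2 values are made up for illustration (the Excel file is not available here), and the loop is vectorized over the samples of each row instead of calling weightedRandomSample. Note that the probabilities are taken from columns 2 to 6 only, since column 1 is the UGT ID. The loop in the question passes the whole six-element row DataTab2(i,:) as weights for the five-element fi, which is a likely reason the randsample call fails.

```matlab
% Hypothetical stand-in for DataTab2: column 1 = UGT ID,
% columns 2-6 = probabilities of fi = 10, 20, 30, 40, 50.
DataTab2 = [1101 0.5 0.5 0   0   0;
            1102 0   0   0.5 0.5 0];
fi      = [10 20 30 40 50];
Nsample = 10;
nUGT    = size(DataTab2, 1);

Fv = zeros(Nsample, nUGT);           % one column of samples per UGT
for i = 1:nUGT
    p  = DataTab2(i, 2:end);         % skip the ID column
    pc = cumsum(p);                  % cumulative distribution for this UGT
    pc(end) = 1;                     % guard against round-off in cumsum
    r   = rand(Nsample, 1);          % one uniform number per sample
    % Count how many entries of pc each r exceeds (implicit expansion,
    % R2016b or later); adding 1 gives the same index as the while loop.
    idx = sum(r > pc, 2) + 1;
    Fv(:, i) = reshape(fi(idx), [], 1);
end
```

With these made-up probabilities, the first column of Fv contains only 10s and 20s and the second only 30s and 40s. If the Statistics and Machine Learning Toolbox is available, randsample(fi, Nsample, true, DataTab2(i,2:end)) should be a drop-in alternative for the body of the loop.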