Hi, I would like to create a weighted sample from an m-by-n matrix, starting from an Excel file ("DataTab", please see the attached image). The first column (UGT) holds the ID of each row, and columns B-F hold the probabilities associated with the variable "fi" for each UGT.

"fi" is a scalar that takes one of the values 10, 20, 30, 40, 50. This means that UGT 1101 has a 50% probability of taking the value 10 or 20, UGT 1102 can be 30 or 40, and so on.

I used the "randsample" function, which I have only managed to use in the scalar case, and I am not able to apply it to my situation.

fi = [10,20,30,40,50];

p = [0.15,0.20,0.30,0.10,0.25];

n=10000;

sample = randsample (fi, n, true, p);

hist(sample,fi);

bar(fi,n*p,0.7);

Below is the entire code. It works correctly with the "Fv2" variable but not with "Fv1", which is the goal of my question.

Nsample = 10;

DataTab = xlsread('Scenari_stabilita_R6.xlsx','S1','A2:f6');

Ugt = csvread('raster_ugt.acs');

UgtV = reshape(Ugt,[],1);

MRas = [UgtV];

MCycle = MRas(~idxNaN,:);

a=unique(MCycle(:,4));

idxDataTab=ismember(DataTab(:,1),a);

DataTab2 = DataTab(idxDataTab,:);

nUGT = length(a);

fi = [10,20,30,40,50];

Fv = [];

for i = 1:nUGT

Fv1 = randsample (fi, Nsample, true, DataTab2(i,:));

%Fv2 = (DataTab2(i,3)-DataTab2(i,2)).*rand(Nsample,1) + DataTab2(i,2);

% this line samples from a uniform distribution and

% has to be modified into weighted sampling

Fv = [Fv,Fv1];

end

The "Fv1" variable should look like this (without the first row, which I show only as an example to aid understanding):

Can anyone help me, please?

William Rose
on 2 May 2021

Function x=weightedRandomSample(v,p), attached, does what you have requested. It returns a random sample from vector v, where the probability of the element selected is given by a vector of probabilities, p.

The function first checks that p and v have the same length and that p sums to unity (within a small tolerance). Then the function generates a uniform random number between 0 and 1. Then the function uses that random number, and the probability vector p, to select an index in the vector. Finally, the function returns the element of v specified by that index.

Vector p is a probability density (strictly, a probability mass function, since the values are discrete): the probability of selecting each element. The key step in function weightedRandomSample is to make vector pc, which is the cumulative probability distribution, i.e. the running sum of p. The function then uses a uniform random number and vector pc to select the index for sampling.

Example 1: p=[.7, .2, .1, 0, 0]. Then pc=[.7, .9, 1, 1, 1]. Suppose r=rand()=0.72. Let i=1. Then check if r>pc(1). Yes it is, so i=i+1=2. Check if r>pc(2). It is not, so exit the While loop, with i=2. Return element 2 from the vector v.

Example 2: p=[.2, .2, .2, .2, .2]. Then pc=[.2, .4, .6, .8, 1]. Suppose r=rand()=0.53. Let i=1. Then check if r>pc(1). Yes it is, so i=i+1=2. Check if r>pc(2). Yes it is, so i=i+1=3. Check if r>pc(3). It is not, so exit the While loop, with i=3. Return element 3 from the vector v.
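The steps described above can be sketched as follows. This is a minimal reconstruction from the description, not the attached file itself: the tolerance value, the error messages, and the extra bound on the loop index are assumptions.

```matlab
function x = weightedRandomSample(v, p)
% Sketch of the steps described above (not the attached file).
% v : vector of candidate values
% p : vector of selection probabilities, same length as v, summing to 1
tol = 1e-6;                       % assumed tolerance for the sum-to-one check
if numel(v) ~= numel(p)
    error('v and p must have the same length.');
end
if abs(sum(p) - 1) > tol
    error('The elements of p must sum to 1.');
end
pc = cumsum(p);                   % cumulative distribution, e.g. [.7 .9 1 1 1]
r = rand();                       % uniform random number between 0 and 1
i = 1;
% Advance while r exceeds the cumulative probability. The bound on i is an
% added guard in case sum(p) falls slightly below 1 within the tolerance.
while i < numel(v) && r > pc(i)
    i = i + 1;
end
x = v(i);                         % element of v selected by the index
end
```

Saved as weightedRandomSample.m, it could be called as x = weightedRandomSample([10 20 30 40 50], [.7 .2 .1 0 0]), which returns one value per call.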

I have also included script weightedRandomSampleDemo.m, which demonstrates the use of function weightedRandomSample.m. The script weightedRandomSampleDemo has six different probability vectors (six rows in probArray). For each probability vector, it takes 1000 random samples of vector fi=[10 20 30 40 50]. Finally, it plots six histograms, showing the number of samples of each value of fi, for the six probability vectors. The plot which it creates, showing six histograms, is below.
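As a rough illustration of how the same cumulative-probability idea could be applied to the table in the question, here is a minimal sketch that draws Nsample values for every UGT row at once. The probability values in DataTab2 are made up for the example, and the Nsample-by-nUGT layout of Fv is an assumption about the desired output. Note that only columns 2 to 6 are used as weights: the first column is the UGT ID, so the weight vector must skip it to match the five elements of fi.

```matlab
% Hypothetical probability table: column 1 is the UGT ID, columns 2-6 are
% the probabilities of fi = 10, 20, 30, 40, 50 (made-up values).
DataTab2 = [1101 0.5 0.5 0   0   0;
            1102 0   0   0.5 0.5 0;
            1103 0.2 0.2 0.2 0.2 0.2];
fi = [10 20 30 40 50];
Nsample = 10;
nUGT = size(DataTab2, 1);

Fv = zeros(Nsample, nUGT);            % one column of samples per UGT
for i = 1:nUGT
    p  = DataTab2(i, 2:end);          % weights only, skipping the ID column
    pc = cumsum(p) / sum(p);          % cumulative distribution, normalized
    r  = rand(Nsample, 1);            % one uniform random number per sample
    % For each r, count how many cumulative values it exceeds; adding 1
    % gives the index of the selected element of fi.
    idx = sum(bsxfun(@gt, r, pc), 2) + 1;
    idx = min(idx, numel(fi));        % guard against rounding in pc(end)
    Fv(:, i) = fi(idx);
end
disp(Fv);
```

If the Statistics and Machine Learning Toolbox is available, the lines that compute pc, r, and idx could presumably be replaced by a call such as randsample(fi, Nsample, true, p), again with p taken from columns 2 to 6 only.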
