MATLAB Answers

datasample/bootstrap procedure

Basically I need to ensure for all rows in the variable t1, that they are resampled in the same way.
That means that if the first draw is observation 2, then I need observation 2 to be the first draw for every series in t1. This continues for each draw, so that I end up with b resamples in which the observations are reordered, but reordered in the same way across all of t1. Using rng achieves this. I also need the variable "carhartt" to follow the same procedure.
However, I am wondering whether there is a better method for doing this. It is very important that I resample all of t1 in the same way.
It is also important that each bootstrap/datasample is random.
Do any of you experts have a better solution, or do you approve of the one I use here?
This is how I am currently trying to bootstrap/resample:
nans = any(isnan(t1),1);
for i =1:size(t1,2)
for B=1:b


Guillaume on 14 Mar 2020
There's a lack of context and a complete lack of comments in the code, which make it hard to understand your question.
"Basically I need to ensure for all rows in the variable t1 [...]" What does t1 represent?
"[...]that they are resampled in the same way" What sort of resampling are you talking about?
"That means that if the first draw is observation 2[...]" What draw? What observation?
Anton Sørensen on 14 Mar 2020
Hi, I am sorry about that, I didn't think it was important for creating the resample.
I will explain each variable. t1 represents mutual fund performance in the first 12 months of the original sample, and I will use the residuals for each mutual fund in the resample.
carhartt is the benchmark data for the first 12 months of the sample.
I will illustrate what I need with the following example:
Suppose the first resample of mutual fund 1's residuals is drawn from the following observations: Residual(1), Residual(3), Residual(10), Residual(8), Residual(3), Residual(8), Residual(10), Residual(10), Residual(12), Residual(1), Residual(1) and Residual(12). Then I need the first resample for all the other mutual funds and for the Carhartt benchmark data to be drawn in the same way, so that the order is identical. That is, fund 10, for instance, draws the same observation numbers, but from its own residual distribution (and the same goes for the Carhartt benchmarks).
Does it make more sense now?
Anton Sørensen on 14 Mar 2020
It is also important to mention that rng(B) seems to do this; however, the resampling will not be totally random, since it depends on the seed B=1...b.
So I wondered whether this is an acceptable solution, or whether there is a better way.


Accepted Answer

Guillaume on 14 Mar 2020
In the following, I'm assuming that residuals and carhartt are both matrices. It looks like they are from your code. If not the code should be slightly different but the principle still stands.
However, I find it a bit strange that you're drawing K rows with replacement from a matrix with K rows. I would have expected the two values of K to be different.
Anyway, instead of drawing the samples directly, draw their row indices. You can then use the indices for both matrices:
samplerows = datasample(1:size(residuals, 1), size(residuals, 1));
%or without any need of the stat toolbox:
%samplerows = randi(size(residuals, 1), size(residuals, 1), 1); %1st argument is the largest row index, 2nd and 3rd request an n-by-1 vector of samples
bootresiduals = residuals(samplerows, :);
bootcarhart = carhartt(samplerows, :);
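To extend this to b bootstrap replicates without reseeding inside the loop, one option is to draw all the row indices up front as an n-by-b matrix, one column per replicate. This is only a sketch: it assumes residuals has one column per fund and carhartt has one column per factor (four columns is an assumption), and the data, the sizes, and the names idx, bootresiduals and bootcarhartt are made up for illustration.

```matlab
rng(42);                          % seed once for reproducibility (arbitrary seed)
n = 12; nFunds = 5; b = 1000;     % illustrative sizes, not taken from the question
residuals = randn(n, nFunds);     % placeholder data: one column per fund
carhartt  = randn(n, 4);          % placeholder data: one column per factor (assumed 4)

idx = randi(n, n, b);             % n-by-b; column k holds the row indices of replicate k

bootresiduals = zeros(n, nFunds, b);
bootcarhartt  = zeros(n, 4, b);
for k = 1:b
    rows = idx(:, k);
    bootresiduals(:, :, k) = residuals(rows, :);  % same rows for every fund
    bootcarhartt(:, :, k)  = carhartt(rows, :);   % and for the benchmark
end
```

Because every replicate takes its rows from a single column of idx, all funds and the benchmark are reordered identically within a replicate, while the replicates themselves come from one random stream rather than from b separate seeds.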


Anton Sørensen on 14 Mar 2020
Hi again,
Thanks for your answer, brilliant solution.
There is one problem related to this.
In some of my tests I will have mutual funds that don't have data for the entire sample period, so they contain NaNs. How do I take care of that in the solution you have provided?
I now have my resampled residuals for each fund, stored as a matrix.
I have attached two pictures of how the outcome looks.
Since I need to obtain OLS estimates on my resampled residuals, do you have a suggestion on how to do this for each resample and for all mutual funds?
I have a function called "ols", where I basically just insert my variables, i.e.: results=ols(residuals,carhart);
Thanks again for the much appreciated solution.
Guillaume on 14 Mar 2020
I know nothing about finance computation, so I have no idea what "OLS estimates on my resampled residuals" actually means and don't have any inkling as to what you're asking.
Anton Sørensen on 14 Mar 2020
No problem.
Thanks a lot for your help.
I might have a future problem with this datasampling procedure, since some of my funds don't have data for the entire sample period and I am therefore dealing with NaNs.
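One possible way to handle the NaN case, offered as a sketch rather than a tested recipe: keep the shared row indices for every fund and, when fitting each fund, drop the resampled rows where that fund has NaN. The backslash operator stands in here for the "ols" function mentioned above, whose interface is not shown in this thread; the data, the four-factor layout, and all variable names are invented for illustration.

```matlab
rng(1);                               % arbitrary seed
n = 12; nFunds = 3;                   % illustrative sizes
y = randn(n, nFunds);                 % placeholder fund series, one column per fund
y(1:4, 2) = NaN;                      % fund 2 has no data for the first 4 months
X = [ones(n, 1), randn(n, 4)];        % intercept plus four placeholder factors

rows = randi(n, n, 1);                % one shared draw of row indices
yb = y(rows, :);                      % resampled funds
Xb = X(rows, :);                      % resampled regressors, same rows

beta = NaN(size(X, 2), nFunds);       % one coefficient column per fund
for f = 1:nFunds
    ok = ~isnan(yb(:, f));            % resampled rows where this fund has data
    if nnz(ok) > size(X, 2)           % require more observations than coefficients
        beta(:, f) = Xb(ok, :) \ yb(ok, f);   % least-squares fit on valid rows only
    end
end
```

Funds with too few valid rows in a given resample keep NaN coefficients. With samples this short, repeated rows can also make the regressor matrix rank deficient, so the estimates deserve a check before they are used.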
