How can I do an 80-20 split on a dataset to obtain training and test datasets?

I tried [training, test] = partition (faceDatabase, [0.8, 0.2]); but it gives me an error. Can anyone help? Is there a way to do this manually? I can't find a function for this!

Accepted Answer

KSSV on 15 Mar 2018
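Let P be your input set (one observation per row) and T your target set.
PD = 0.80 ; % fraction of the data used for training (80%)
nTrain = round(PD*length(T)) ; % number of training observations
Ptrain = P(1:nTrain,:) ; Ttrain = T(1:nTrain) ;
Ptest = P(nTrain+1:end,:) ; Ttest = T(nTrain+1:end) ; % start at nTrain+1 so the two sets do not overlap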
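For readers unsure what P and T stand for, here is a minimal sketch with made-up data. The 10-by-3 matrix and the target vector are assumptions for illustration only, not taken from the question; substitute your own variables. The test set starts at row nTrain+1 so that no observation lands in both sets.

```matlab
% Hypothetical example data (not from the original post):
% P holds one observation per row, T holds one target per observation.
P = rand(10, 3);          % 10 observations, 3 features
T = (1:10)';              % 10 targets, one per row of P

PD = 0.80;                % fraction of the data used for training
N = size(P, 1);           % number of observations
nTrain = round(PD * N);   % number of training observations

% First nTrain rows go to training, the remaining rows to testing
Ptrain = P(1:nTrain, :);       Ttrain = T(1:nTrain);
Ptest  = P(nTrain+1:end, :);   Ttest  = T(nTrain+1:end);
```

Note that this takes the first 80% of the rows in their stored order, so it is only appropriate if the rows are not sorted by class or by time.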
  2 Comments
Chidiebere Ike on 15 Mar 2018
Edited: Chidiebere Ike on 15 Mar 2018
I tried the code, and it says "undefined function or variable T". I would appreciate it if you could explain what P, T, and length are. How do I resolve this?


More Answers (2)

Akira Agata on 15 Mar 2018
Edited: Akira Agata on 15 Mar 2018
If you want to randomly select 80% of your data as the training dataset, try the following:
PD = 0.80 ; % fraction of the data used for training (80%)
% Let P be your N-by-M input dataset
% Solution-1 (need Statistics & ML Toolbox)
cv = cvpartition(size(P,1),'HoldOut',1-PD); % 'HoldOut' takes the fraction held out for the test set
Ptrain = P(cv.training,:);
Ptest = P(cv.test,:);
Another possible solution:
% Solution-2 (using basic MATLAB function)
N = size(P,1);
idx = randperm(N);
Ptrain = P(idx(1:round(N*PD)),:);
Ptest = P(idx(round(N*PD)+1:end),:);
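If you also have a target vector, the same shuffled indices should be applied to both the inputs and the targets so that they stay aligned. A minimal sketch using only basic MATLAB functions follows; the example data, the target vector T, and the fixed seed are assumptions added for illustration and are not part of the answer above.

```matlab
% Hypothetical example data (replace with your own P and T)
P = rand(10, 3);          % N-by-M input dataset, one observation per row
T = (1:10)';              % one target per observation

rng(0);                   % fixed seed so the split is reproducible (optional)
PD = 0.80;                % fraction of the data used for training
N = size(P, 1);
idx = randperm(N);        % random ordering of the row indices 1..N
nTrain = round(N * PD);   % number of training observations

trainIdx = idx(1:nTrain);       % shuffled indices for training
testIdx  = idx(nTrain+1:end);   % remaining indices for testing

% Index inputs and targets with the same index vectors
Ptrain = P(trainIdx, :);   Ttrain = T(trainIdx);
Ptest  = P(testIdx, :);    Ttest  = T(testIdx);
```

Because trainIdx and testIdx come from a single permutation, every row ends up in exactly one of the two sets.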
  1 Comment
Chidiebere Ike on 15 Mar 2018
Solution 1 gives an error message: Error in cvpartition CV.Impl = internal.stats.cvpartitionInMemoryImpl(varargin{:});



Munshida P on 14 Jan 2020
This will help you.
[training,test] = partition(faceDatabase,[0.8 0.2]);
