How to divide dataset into a test, train, split format?

Hello,
I'm trying to split my dataset into X_train, X_test, y_train and y_test, in a similar fashion to Python's train_test_split (scikit-learn), but I'm struggling to find a method to do so. Is this possible in MATLAB?
I've tried doing the following
seed = 42;
rng(seed);
cv = cvpartition(size(dataset,1), "HoldOut", 0.2);
idx = cv.test;
X_train = subsample(~idx,:);
y_test = subsample(idx,:);
but I'm not entirely sure how to go about deriving X_test and y_train.
Does anybody have a good solution to this? Apologies, as I'm fairly new to MATLAB!
Thank you!

Accepted Answer

Ameer Hamza on 4 Nov 2020
Does the variable subsample contain both 'X' and 'y' values? If yes, then you don't need to create two variables for 'X' and 'y'. Just use
subsample_train = subsample(cv.training, :)
subsample_test = subsample(cv.test, :)
However, if subsample contains the 'X' values and another variable (say, 'y') contains the y values, then you can do something like this
X_train = subsample(cv.training, :);
y_train = y(cv.training, :);
X_test = subsample(cv.test, :);
y_test = y(cv.test, :);
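For reference, here is a minimal end-to-end sketch of the second case. It assumes the Statistics and Machine Learning Toolbox is installed (cvpartition lives there), and the variables X and y are synthetic placeholders made up for illustration, not the asker's data. It uses the function-call forms training(cv) and test(cv) to get the logical row indices.

```matlab
% Synthetic data for illustration only: 100 observations, 3 features.
rng(42);                          % fix the seed so the split is reproducible
X = rand(100, 3);                 % hypothetical feature matrix
y = randi([0 1], 100, 1);         % hypothetical label vector

% Hold out roughly 20% of the observations for testing.
cv = cvpartition(size(X, 1), "HoldOut", 0.2);
idxTrain = training(cv);          % logical vector, true for training rows
idxTest  = test(cv);              % logical vector, true for test rows

% Index features and labels with the same row masks so they stay aligned.
X_train = X(idxTrain, :);
y_train = y(idxTrain, :);
X_test  = X(idxTest, :);
y_test  = y(idxTest, :);
```

Because the same masks are applied to X and y, each row of X_train still lines up with the matching entry of y_train, and likewise for the test set.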